Tumblelog by Soup.io

July 01 2013

14:57

Monday Q&A: Denise Malan on the new data-driven collaboration between INN and IRE

Every news organization wishes it could have more reporters with data skills on staff. But not every news organization can afford to make data a priority — and even those that do can sometimes struggle to find the right candidates.

A new collaboration between two journalism nonprofits — the Investigative News Network and Investigative Reporters and Editors — aims to address this allocation issue. Denise Malan, formerly an investigative and data reporter at the Corpus Christi Caller-Times, will fill the new role of INN director of data services, offering “dedicated data-analysis services to INN’s membership of more than 80 nonprofit investigative news organizations,” many of them three- or four-person teams that can’t find room or funding for a dedicated data reporter.

It’s a development that could strengthen both the investigative work being done by these institutions and skill building around data analysis in journalism. Malan has experience in training journalists in the skills of procuring, cleaning, and analyzing data, and she has high hopes for the kinds of stories and networked reporting that will be produced by this collaboration. We talked about IRE’s underutilized data library, potentially disruptive Supreme Court decisions around freedom of information, the unfortunate end for wildlife wandering onto airplane runways, and what it means to translate numbers into stories.

O’Donovan: How does someone end up majoring in physics and journalism?
Malan: My freshman year they started a program to do a bachelor of arts in physics. Physics Lite. And you could pair that with business or journalism or English — something that was really your major focus of study, but the B.A. in physics would give you a good science background. So you take physics, you take calculus, you take statistics, and that really gives you the good critical thinking and data background to pair with something else — in my case, journalism.
O’Donovan: I guess it’s kind of easy to see how that led into what you’re doing now. But did you always see them going hand in hand? Or is that something that came later?
Malan: In college, I thought I was going to be a science writer. That was the main reason I paired those. When I got into news and started going down the path of data journalism, I was very glad to have that background, for sure. But I started getting more into the data journalism world when the Caller-Times in Corpus Christi sent me to the IRE bootcamp, where it’s a weeklong, intensive week where you concentrate on learning Excel and Access and the different pitfalls you can face in data — some basic cleaning skills. That’s really what got me started in the data journalism realm. And then the newspaper continued to send me to training — to the CAR conferences every year and local community college classes to beef up my skills.
O’Donovan: So, how long were you at the Caller-Times?
Malan: I was there seven years. I started as a reporter in June 2006, and then moved up into editing in May of 2010.
O’Donovan: And in the time that you were there as their data person, what are some stories that you were particularly proud of, or made you feel like this was a burgeoning field?
Malan: We focused on intensely local projects at the Caller-Times. One of the ones that I was really proud of I worked on with our city hall reporter Jessica Savage. She found out that the city streets are a huge issue in Corpus Christi. If you’ve ever driven here, you know they are just horrible — a disaster. And the city is trying to find a billion dollars to fix them.

So our city hall reporter found out that the city keeps a database of scores called the Pavement Condition Index. Basically, it’s the condition of your street. So we got that database and we merged it with a file of streets and color-coded it so people could fully see what the condition of their street was, and we put it in a database for people to find their exact block. This was something the city did not want to give us at first, because if people know the condition of their street scores, they’re going to demand that we do something about it. We’re like, “Yeah, that’s kind of the idea.” But that database became the basis for an entire special section on our streets. We used it to find people on streets who scored a 0, and talked about how it affects their life — how often they have to repair their cars, how often they walk through giant puddles.

And then we paired it with a breakout box of every city council member and their score. We did a map online, which, for over a year, actually, has been a big hit while the city is discussing how they’re going to find this money. People have been using it as a basis for the debate that they’re having, which, to me, is really kind of how we make a difference. Using this data that the city had, bringing it to light, making it accessible, I think, has really just changed the debate here for people. So that’s one thing I’m really proud of — that we can give people information to make informed decisions.
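The merge-and-color-code step Malan describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the sample rows, and the color thresholds are hypothetical, and the score is assumed to run from 0 (worst) to 100 (best). The interview does not describe the actual layout of the city's database or the tools the Caller-Times used for this step.

```python
# Hypothetical sample rows; the real city database layout is not described.
pci_scores = [
    {"street_id": "S1", "pci": 0},
    {"street_id": "S2", "pci": 62},
    {"street_id": "S3", "pci": 91},
]
streets = [
    {"street_id": "S1", "name": "Example Ave", "block": "100"},
    {"street_id": "S2", "name": "Sample St", "block": "200"},
    {"street_id": "S3", "name": "Placeholder Rd", "block": "300"},
    {"street_id": "S4", "name": "Unscored Ln", "block": "400"},
]


def color_for(pci):
    """Bucket a 0-100 score into a map color (illustrative thresholds)."""
    if pci < 40:
        return "red"
    if pci < 70:
        return "yellow"
    return "green"


def merge_scores(streets, pci_scores):
    """Join each street segment to its score and attach a display color."""
    score_by_id = {row["street_id"]: row["pci"] for row in pci_scores}
    merged = []
    for street in streets:
        pci = score_by_id.get(street["street_id"])
        if pci is None:
            continue  # no score on file for this segment
        merged.append({**street, "pci": pci, "color": color_for(pci)})
    return merged


colored_streets = merge_scores(streets, pci_scores)
# Streets scoring 0 are the ones a reporter might visit first.
worst = [s for s in colored_streets if s["pci"] == 0]
```

Keying the join on a shared identifier, rather than on street names, avoids mismatches caused by spelling variations between the two files.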

O’Donovan: Part of your new position is going to be facilitating and assisting other journalists in starting to understand how to do this kind of work. How do you tell reporters that this isn’t scary — that it’s something they can do or they can learn? How do you begin that conversation?
Malan: [At the Caller-Times] we adopted the philosophy that data journalism isn’t just something that one nerdy person in the office does, but something that everyone in the newsroom should have in their toolbox. It really enhances every beat at the newspaper.

I would do training sessions occasionally on Excel, Google Fusion Tables, Caspio to show everyone in the newsroom, “Here’s what’s possible.” Some people really pick up on it and take it and run with it. Some people are not as math oriented and are not going to be able to take it and run with it themselves, but at least they know those tools are available and what it’s possible to do with them.

So some of the reporters would be just aware of how we could analyze data and they would keep their eyes open for databases on their beats, and other reporters would run with it. That philosophy is very important in any newsroom today. A lot of what I’m going to be doing with IRE and INN is working with the INN members in helping them to gather the data and analyze it and inform their local reporting. So a lot of the same roles, but in a broader context.

O’Donovan: So a lot of it is understanding that everyone is going to come at it with a different skill level.
Malan: Yes, absolutely. All our members have different levels of skills. Some of our members have very highly skilled data teams, like ProPublica, Center for Public Integrity — they’re really at the forefront of data journalism. Other members are maybe one- or two-person newsrooms that may not have the training and don’t have any reporters with those skills. So the skill sets are all over the board. But it will be my job to help, especially smaller newsrooms, plug into those resources — especially the resources at IRE — the best they can, with the data library there and the training available there. We help them bring up their own skills and enhance their own reporting.
O’Donovan: When a reporter comes to you and says, “I just found this dataset or I just got access to it” — how do you dive into that information when it comes to looking for stories? How do you take all of that and start to look for what could turn into something interesting?
Malan: A lot of it depends on the data set. Just approach every set of data as a source that you’re interviewing. What is available there? What is maybe missing from the data is something you want to think about, too. And you definitely want to narrow it down: A lot of data sets are huge, especially these federal data sets that might have records containing, I don’t know, 120 fields, but maybe you’re only interested in three of them. So you want to get to know the data set, and what is interesting in it, and you want to really narrow your focus.

One collaboration that INN did was using data gathered by NASA for the FAA, and it was essentially near misses — incidents at airports like hitting deer on the runway, and all these little things that can happen but aren’t necessarily reported. They all get compiled in this database, and pilots write these narratives about it, so that field is very interesting to them. There were four or five INN members who collaborated on that, and they all came away with different stories because they all found something else that was interesting for them locally.
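The "interview the data" approach described above (see what fields exist, check what is missing, then narrow to the few fields and places you care about) might look like this as a minimal Python sketch. The column names and sample records are invented for illustration and do not reflect the actual NASA/FAA file layout.

```python
import csv
import io

# Invented sample standing in for a much wider federal file.
raw = io.StringIO(
    "report_id,airport,narrative,aircraft_type,weather\n"
    "1,CRP,Struck deer on runway during landing roll,C172,clear\n"
    "2,AUS,,B737,rain\n"
    "3,CRP,Bird strike on departure,,clear\n"
)

# The handful of fields this hypothetical story needs.
KEEP = ["report_id", "airport", "narrative"]


def profile_and_narrow(handle, keep):
    """List the fields, count blank values per field, keep chosen columns."""
    reader = csv.DictReader(handle)
    missing = {field: 0 for field in reader.fieldnames}
    narrowed = []
    for row in reader:
        for field, value in row.items():
            if not value.strip():
                missing[field] += 1
        narrowed.append({field: row[field] for field in keep})
    return reader.fieldnames, missing, narrowed


fields, missing, narrowed = profile_and_narrow(raw, KEEP)
# Localize: keep only reports from one airport of interest.
local = [row for row in narrowed if row["airport"] == "CRP"]
```

Counting blanks before narrowing shows which fields are too sparse to build a story on, and the final filter mirrors how each newsroom could pull its own local slice from a shared national file.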

O’Donovan: This position you’ll hold is about bringing the work of INN and IRE together. What’s that going to look like? We talk all the time about how journalism is moving in a more networked direction — where do you see this fitting into that?
Malan: IRE and INN have always had a very close relationship, and I think that this position just kind of formalizes that. I will be helping INN members plug into the resources of IRE, especially the data library, I’ll be working closely with Liz Lucas, the database director at IRE, and I’m actually going to be living near IRE so I can work more closely with them. Some of that data there is very underutilized and it’s really interesting and maybe hasn’t been used in any projects, especially on a national level.

So we can take that data and I can kind of help analyze it, help slice it for the various regions we might be looking at, and help the INN members use that data for their stories. I’ll basically be acting as almost a translator to get this data from the IRE and help the INN members use it.

Going the other way, with INN members, they might come up with some project idea where data isn’t available from the database library, or it might be something where we have to gather data from every state individually, so we might compile that and whatever we end up with will be sent back to the IRE library and made available to other IRE members. So it’s a two-way relationship.

O’Donovan: So in terms of managing this collaboration, what are the challenges? Are you thinking of building an interface for sharing data or documents?
Malan: We’re going to be setting up a kind of committee of data people with INN to have probably monthly calls and just discuss ideas, what they’re working on, brainstorming, possible ideas. I want it to be a very organic, ground-up process — I don’t want it to be dictating what the projects should be. I want the members to come up with their own ideas. So we’ll be brainstorming and coming up with things, and we’ll be managing the group through Basecamp and communicating that way. A lot of the other members are already on Basecamp and communicate that way through INN.

We’ll be communicating through this committee and coming up with ideas and I’ll be working with other members to reach out to them. If we come up with an idea that deals with health care, for example, I might reach out to some of the members that are especially focused on health care and try to bring in other members on it.

O’Donovan: Do you foresee collaborations between members, like shared reporting and that kind of thing?
Malan: Yeah, depending on the project. Some of it might be shared reporting; some of it might be someone does a main interview. If we’re doing a crime story dealing with the FBI’s Uniform Crime Report, maybe we just have one reporter from every property, we nominate one person to do the interview with the FBI that everyone can use in their own story, which they localize with their own data. So, yeah, depending on the project, we’ll have to kind of see how the reporting would shake out.
O’Donovan: Do you have any specific goals or types of stories you want to tell, or even just specific data sets you’re eager to get a look at?
Malan: I think there are several interesting sets in the IRE data library that we might go after at first. There’s really interesting health sets, for example, from the FDA — one of them is a database of adverse effects from drugs, complaints that people make that drugs have had adverse effects. So yeah, some of those can be right off the bat, ready to go and parse and analyze.

Some other data sets we might be looking at will be a little harder to get, will take some FOIs and some time to get. There are several major areas that our members focus on and that we’ll be looking at projects for. Environment, for example — fracking is a large issue, and how environment affects public health. Health care, especially with the Affordable Care Act coming into effect next year, is going to be a large one. Politics, government, how money influences politicians is a huge area as we come up on the 2016 elections and the 2014 midterms. And education is another issue with achievement gaps, graduation rates, charter schools — those are all large issues that our members follow. Finding those commonalities and dealing with data sets, digging into that is going to be my first priority.

O’Donovan: The health question is interesting. Knight announced its next round of News Challenge grants is going to be all around health.
Malan: I’m excited about that. We have several members that are really specifically focused on health, so I feel like we might be able to get something good with that.
O’Donovan: Health care stuff or more public health stuff?
Malan: It’s a mix, but a lot of stuff is geared toward the Affordable Care Act now.
O’Donovan: Gathering these data sets must often involve a lot of coordination across states and jurisdictions.
Malan: Yeah, absolutely. One thing I am a little nervous about is the Supreme Court’s recent ruling in the Virginia case where they can now require you to live in a state to put in an FOI. That might complicate things a little bit. I know there are several groups working on lists of people who will put an FOI in for you in various states. But that can kind of just slow down the process and put a little kink in and add to the timeline. I’m concerned of course that now they know it’s been ruled constitutional that every state might make that the law. It could be a huge thing. A management nightmare.
O’Donovan: What kind of advice do you normally give to reporters who are struggling to get information that they know they should be allowed to have?
Malan: That’s something we encountered a lot here, especially getting data in the proper format, too. Laws on that can vary from state to state. A lot of governments will give you paper or PDF format, instead of the Excel or text file that you asked for. It’s always a struggle.

The advice is to know the law as best you can, know what exceptions are allowed under your state law, be able to quote — you don’t have to have the law memorized, but be able to quote specific sections that you know are on your side. Be prepared with your requests, and be prepared to fight for it. And in a lot of cases, it is a fight.

O’Donovan: That’s an interesting intersection of technical and legal skill. That’s a lot of education dollars right there.
Malan: Yeah, no kidding.
O’Donovan: When you do things like attend the NICAR conference and assess the scene more broadly, where do you see the most urgent gaps in the data journalism field? Is it that we need more data analysts? More computer scientists? More reporters with the fluency in communicating with government? More legal aid? If you could allocate more resources, where would you put them right now?
Malan: There’s always going to be a need for more very highly skilled data journalists who can gather these national sets, analyze them, clean them, get them into a digestible format, visualize them online, and inform readers. I would like to see more general beat reporters interested in data and at least getting skills in Excel and even Access — because the beat reporters are the ones on the ground, using their sources, finding these data sets or not finding them if they’re not aware of what data is. I would really like this to be a bigger push to at least educate most general beat reporters to a certain level.
O’Donovan: Where do you see the data journalism movement headed over the next couple years? What would your next big hope for the field be?
Malan: Well, of course I hope for it to go kind of mainstream, and that all reporters will have some sort of data skills. It’s of course harder with fewer and fewer resources, and reporters are learning how to tweet and Instagram, and there are demands on their time that have never been there.

But I would hope it would become just a normal part of journalism, that there would be no more “data journalism” — that it just becomes part of what we do, because it’s invaluable to reporting and to really helping ferret out the truth and to give context to stories.

June 27 2013

16:27

Sensor journalism, storytelling with Vine, fighting gender bias and more: Takeaways from the 2013 Civic Media Conference

Are there lessons journalists can learn from Airbnb? What can sensors tell us about the state of New York City’s public housing stock? How can nonprofits, governments, and for-profit companies collaborate to create places for public engagement online?

These were just a few of the questions asked at the annual Civic Media Conference hosted by MIT and the Knight Foundation in Cambridge this week. It covered a diverse mix of topics, ranging from government transparency and media innovation to disaster relief and technology’s influence on immigration issues. (For a helpful summary of the event’s broader themes, check out VP of journalism and innovation Michael Maness’s wrap-up talk.)

There was a decided bent towards pragmatism in the presentations, underscored by Knight president Alberto Ibargüen’s measured, even questioning introduction to the News Challenge winners. “I ask myself what we have actually achieved,” he said of the previous cycles of the News Challenge. “And I ask myself how we can take this forward.”

While the big news was the announcement of this year’s winners and the fate of the program going forward, there were plenty of discussions and presentations that caught our attention.

Panelists and speakers — from Republican Congressman Darrell Issa and WNYC’s John Keefe to Columbia’s Emily Bell and recent MIT grads — offered insights on engagement (both online and off), data structure and visualization, communicating with government, the role of editors, and more. In the words of The Boston Globe’s Adrienne Debigare, “We may not be able to predict the future, but at least we can show up for the present.”

One more News Challenge

Though Ibargüen spoke about the future of the News Challenge in uncertain terms, Knight hasn’t put the competition on the shelf quite yet. Maness announced that there would indeed be one more round of the challenge this fall with a focus on health. That’s about all that we know about the next challenge; Maness said Knight is still in the planning stages of the cycle and whatever will follow it. Maness said they want the challenge to address questions about tools, data, and technology around health care.

Opening up the newsroom

One of the more lively discussions at the conference focused on how news outlets can identify and harness the experience of outsiders. Jennifer Brandel, senior producer for WBEZ’s Curious City, said one way to “hack” newsrooms was to open them up to stories from freelance writers, but also to more input from the community itself. Brandel said journalists could also look beyond traditional news for inspiration for storytelling, mentioning projects like Zeega and the work of the National Film Board of Canada.

Laura Ramos, vice president of innovation and design for Gannett, said news companies can learn lessons on user design and meeting user needs from companies like Airbnb and Square. Ramos said another lesson to take from tech companies is discovering, and addressing, specific needs of users.

Bell, director of the Tow Center for Digital Journalism at Columbia University, said one solution for innovation at many companies has been creating research and development departments. But with R&D labs, the challenge is integrating the experiments of the labs, which are often removed from day-to-day activity, with the needs of the newsroom or other departments. Bell said many media companies need leadership that is open to experimentation and can juggle the immediate needs of the business with big-picture planning. Too often in newsrooms, or around the industry, people follow old processes or old ideas and are unable to change, something Bell compared to “watching six-year-olds playing soccer,” with everyone running to the ball rather than performing their role.

Former Knight-Mozilla fellow Dan Schultz said the issue of innovation comes down to how newsrooms allocate their attention and resources. Schultz, who was embedded at The Boston Globe during his fellowship, said newsrooms need to better allocate their developer and coding talent between day-to-day operations like dealing with the CMS and experimenting on tools that could be used in the future. Schultz said he supports the idea of R&D labs because “good technology needs planning,” but the needs of the newsroom don’t always meet with long-range needs on the tech side.

Ramos and Schultz both said one of the biggest threats to change in newsrooms can be those inflexible content management systems. Ramos said the sometimes rigid nature of a CMS can force people to make editorial decisions based on where stories should go, rather than what’s most important to the reader.

Vine, Drunk C-SPAN, and gender bias

!nstant: There was Nieman Foundation/Center for Civic Media crossover at this year’s conference: 2013 Nieman Fellows Borja Echevarría de la Gándara, Alex Garcia, Paula Molina, and Ludovic Blecher presented a proposal for a breaking news app called !nstant. The fellows created a wireframe of the app after taking Ethan Zuckerman’s News and Participatory Media class.

The app, which would combine elements of liveblogging and aggregation around breaking news events, was inspired by the coverage of the Boston marathon bombing and manhunt. The app would pull news and other information from a variety of sources, “the best from participatory media and traditional journalism,” Molina said. Rather than being a simple aggregator, !nstant would use a team of editors to curate information and add context to current stories when needed. “The legacy media we come from is not yet good at organizing the news in a social environment,” said Echevarría de la Gándara.

Drunk C-SPAN and Opened Captions: Schultz also presented a project — or really, an idea — that seems especially timely when more Americans than usual are glued to news coming out of the capitol. When Schultz was at the Globe, he realized it would be both valuable and simple to create an API that pulls closed captioning text from C-SPAN’s video files, a project he called Opened Captions, which we wrote about in December. “I wanted to create a service people could subscribe to whenever certain words were spoken on C-SPAN,” said Schultz. “But the whole point is [the browser] doesn’t know when to ask the questions. Luckily, there’s a good technology out there called WebSocket that most browsers support that allows the server and the browser to talk to each other.”

To draw attention to the possibilities of this technology, Schultz began experimenting with a project called Drunk C-SPAN, in which he aimed to track key terms used by candidates in a televised debate. The more the pols repeat themselves, the more bored the audience gets and the “drunker” the program makes the candidates sound.

But while Drunk C-SPAN was topical and funny, Schultz says the tool should be less about what people are watching and more about what they could be watching. (Especially since almost nobody in the gen pop is watching C-SPAN regularly.) Specifically, he envisions a system in which Opened Captions could send you data about what you’re missing on C-SPAN, translate transcripts live, or alert you when issues you’ve indicated an interest in are being discussed. For the nerds in the house, there could even be a badge system based on how much you’ve watched.
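The keyword-alert idea can be illustrated with a small Python sketch. It is not the Opened Captions code or API: the caption lines here are a hard-coded list standing in for a live stream, and the function name and sample text are hypothetical. A real deployment would receive lines over a WebSocket connection, as Schultz describes, instead of iterating over a list.

```python
import re
from collections import Counter


def watch_captions(caption_lines, watch_terms):
    """Yield (line_number, term, line) each time a watched term appears."""
    patterns = {
        term: re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in watch_terms
    }
    for number, line in enumerate(caption_lines, start=1):
        for term, pattern in patterns.items():
            if pattern.search(line):
                yield number, term, line


# Invented caption text standing in for a live feed.
captions = [
    "The gentleman from California is recognized.",
    "We must act on health care this session.",
    "Immigration reform and health care remain priorities.",
]

alerts = list(watch_captions(captions, ["health care", "immigration"]))
# Number of caption lines mentioning each term, a rough repetition measure.
term_counts = Counter(term for _, term, _ in alerts)
```

Because the function is a generator, it can process lines one at a time as they arrive, which suits a feed that never ends. The per-term counts are the kind of tally a Drunk C-SPAN-style repetition tracker would need.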

Schultz says Opened Captions is fully operational and available on GitHub, and he’s eager to hear any suggestions around scaling it and putting it to work.

Follow Bias is a Twitter plugin that calculates and visualizes the gender diversity of your Twitter followers. When you sign in to the app, it graphs how many of your followers are male, female, brands, or bots. Created by Nathan Mathias and Sarah Szalavitz of the MIT Media Lab, Follow Bias is built to counteract the pernicious function of social media that allows us to indulge our unconscious biases and pass them along to others, contributing to gender disparity in the media rather than counteracting it.

The app is still in private beta, but a demo, which gives a good summary of gender bias in the media, is online here. “The heroes we share are the heroes we have,” it reads. “Among lives celebrated by mainstream media and sites like Wikipedia, women are a small minority, limiting everyone’s belief in what’s possible.” The Follow Bias server updates every six hours, so the hope is that users will try to correct their biases by broadening the diversity of their Twitter feed. Eventually, Follow Bias will offer metrics, follower recommendations, and will allow users to compare themselves to their friends.
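The breakdown that Follow Bias graphs amounts to computing each category's share of a follower list. A minimal sketch follows, assuming each follower has already been assigned a category. How Follow Bias actually classifies accounts is not described here, and the sample data and function name are hypothetical.

```python
from collections import Counter


def category_shares(followers):
    """Return each category's fraction of the follower list."""
    counts = Counter(follower["category"] for follower in followers)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: count / total for category, count in counts.items()}


# Hypothetical, pre-classified followers.
followers = [
    {"handle": "@a", "category": "male"},
    {"handle": "@b", "category": "male"},
    {"handle": "@c", "category": "female"},
    {"handle": "@d", "category": "brand"},
]

shares = category_shares(followers)
```

Recomputing the shares after each refresh of the follower list would show whether a user's feed is becoming more balanced over time.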

LazyTruth: Last fall, we wrote about Media Lab grad student Matt Stempeck’s LazyTruth, the Gmail extension that helps factcheck emails, particularly chain letters and phishing scams. After launching LazyTruth last fall, Stempeck told the audience at the Civic Media conference that the tool has around 7,000 users. He said the format of LazyTruth may have capped its growth: “We’ve realized the limits of Chrome extensions, and browser extensions in general, in that a lot of people who need this tool are never going to install browser extensions.”

Stempeck and his collaborators have created an email reply service for LazyTruth that lets users send suspicious messages to ask@lazytruth.com to get an answer. Stempeck said they’ve also expanded their misinformation database with information from Snopes, Hoax-Slayer and Sophos, an antivirus and computer security company.

LazyTruth is now also open source, with the code available on GitHub. Stempeck said he hopes to find funding to expand the fact-checking into social media platforms.

Vine Toolkit: Recent MIT graduate Joanna Kao is working on a set of tools that would allow journalists or anyone else to use Vine in storytelling. The Vine Toolkit would provide several options to add context around the six-second video clips.

Kao said Vines offer several strengths and weaknesses for journalists: the short length, ease of use, and the built-in social distribution network around the videos. But the length is also problematic, she said, because it doesn’t provide context for readers. (Instagram’s moving in on this turf.) One part of the Vine Toolkit, Vineyard, would let users string together several vines that could be captioned and annotated, Kao said. Another tool, VineChatter, would allow a user to see conversations and other information being shared about specific Vine videos.

Open Space & Place: Of algorithms and sensor journalism

WNYC: We also heard from WNYC’s John Keefe during the Open Space & Place discussion. Keefe shared the work WNYC did around tracking Hurricane Sandy, and, of course, the Lab’s beloved Cicada Project. (Here’s our most recent check-in on that invasion topic.)

As Keefe has told the Lab in the past, the next big step in data journalism will be figuring out what kind of stories can come out of asking questions of data. To demonstrate that idea, Keefe said WNYC is working on a new project measuring air quality in New York City by strapping sensors to bikers. This summer, they’ll be collaborating with the Mailman School of Public Health to do measurement runs across New York. Keefe said the goal would be to fill in gaps in government data supplied by particulate measurement stations in Brooklyn and the Bronx. WNYC is also interested in filling in data gaps around NYC’s housing authority, says Keefe. After Hurricane Sandy, some families living in public housing went weeks without power and longer without heat or hot water. Asked Keefe: “How can we use sensors or texting platforms to help these people inform us about what government is or isn’t doing in these buildings?”

With the next round of the Knight News Challenge focusing on health, keep an eye on these data-centric, sensor-driven, public health projects, because they’re likely to be going places.

Mapping the Globe: Another way to visualize the news, Mapping the Globe lets you see geographic patterns in coverage by mapping The Boston Globe’s stories. The project’s creator, Lab researcher Catherine D’Ignazio, used the geo-tagged locations already attached to more than 20,000 articles published since November 2011 to show how many of them relate to specific Boston neighborhoods — and by zooming out, how many stories relate to places across the state and worldwide. Since the map also displays population and income data, it’s one way to see what areas might be undercovered relative to who lives there — a geographical accountability system of sorts.
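One simple way to express "undercovered relative to who lives there" is stories per 1,000 residents. The sketch below uses invented place labels and numbers, and the per-capita rate is an assumed metric chosen for illustration, not necessarily the one D’Ignazio’s map uses.

```python
def coverage_per_thousand(story_counts, populations):
    """Stories per 1,000 residents for every place with a known population."""
    rates = {}
    for place, population in populations.items():
        if population <= 0:
            continue  # skip places without a usable population figure
        rates[place] = 1000 * story_counts.get(place, 0) / population
    return rates


# Invented counts and populations for three placeholder areas.
story_counts = {"Area A": 120, "Area B": 30}
populations = {"Area A": 20000, "Area B": 30000, "Area C": 10000}

rates = coverage_per_thousand(story_counts, populations)
# Places ordered from least to most coverage per resident.
least_covered = sorted(rates, key=rates.get)
```

Iterating over the population table instead of the story counts keeps places with zero stories in the result, and those are exactly the areas such a comparison is meant to surface.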

This post includes good screenshots of the prototype interactive map. The patterns raise lots of questions about why certain areas receive more attention than others: Is the disparity tied to race, poverty, unemployment, the location of Globe readers? But D’Ignazio also points out that there are few conclusive correlations or clear answers to her central question — “When does repeated newsworthiness in a particular place become a systemic bias?”
