July 01 2013


Monday Q&A: Denise Malan on the new data-driven collaboration between INN and IRE

Every news organization wishes it could have more reporters with data skills on staff. But not every news organization can afford to make data a priority — and even those that do can sometimes struggle to find the right candidates.

A new collaboration between two journalism nonprofits — the Investigative News Network and Investigative Reporters and Editors — aims to address this allocation issue. Denise Malan, formerly an investigative and data reporter at the Corpus Christi Caller-Times, will fill the new role of INN director of data services, offering “dedicated data-analysis services to INN’s membership of more than 80 nonprofit investigative news organizations,” many of them three- or four-person teams that can’t find room or funding for a dedicated data reporter.

It’s a development that could strengthen both the investigative work being done by these institutions and skill building around data analysis in journalism. Malan has experience in training journalists in the skills of procuring, cleaning, and analyzing data, and she has high hopes for the kinds of stories and networked reporting that will be produced by this collaboration. We talked about IRE’s underutilized data library, potentially disruptive Supreme Court decisions around freedom of information, the unfortunate end for wildlife wandering onto airplane runways, and what it means to translate numbers into stories.

O’Donovan: How does someone end up majoring in physics and journalism?
Malan: My freshman year they started a program to do a bachelor of arts in physics. Physics Lite. And you could pair that with business or journalism or English — something that was really your major focus of study, but the B.A. in physics would give you a good science background. So you take physics, you take calculus, you take statistics, and that really gives you the good critical thinking and data background to pair with something else — in my case, journalism.
O’Donovan: I guess it’s kind of easy to see how that led into what you’re doing now. But did you always see them going hand in hand? Or is that something that came later?
Malan: In college, I thought I was going to be a science writer. That was the main reason I paired those. When I got into news and started going down the path of data journalism, I was very glad to have that background, for sure. But I started getting more into the data journalism world when the Caller-Times in Corpus Christi sent me to the IRE bootcamp, where it’s a weeklong, intensive week where you concentrate on learning Excel and Access and the different pitfalls you can face in data — some basic cleaning skills. That’s really what got me started in the data journalism realm. And then the newspaper continued to send me to training — to the CAR conferences every year and local community college classes to beef up my skills.
O’Donovan: So, how long were you at the Caller-Times?
Malan: I was there seven years. I started as a reporter in June 2006, and then moved up into editing in May of 2010.
O’Donovan: And in the time that you were there as their data person, what are some stories that you were particularly proud of, or made you feel like this was a burgeoning field?
Malan: We focused on intensely local projects at the Caller-Times. One of the ones that I was really proud of I worked on with our city hall reporter Jessica Savage. She found out that the city streets are a huge issue in Corpus Christi. If you’ve ever driven here, you know they are just horrible — a disaster. And the city is trying to find a billion dollars to fix them.

So our city hall reporter found out that the city keeps a database of scores called the Pavement Condition Index. Basically, it’s the condition of your street. So we got that database and we merged it with a file of streets and color-coded it so people could fully see what the condition of their street was, and we put it in a database for people to find their exact block. This was something the city did not want to give us at first, because if people know the condition of their street scores, they’re going to demand that we do something about it. We’re like, “Yeah, that’s kind of the idea.” But that database became the basis for an entire special section on our streets. We used it to find people on streets who scored a 0, and talked about how it affects their life — how often they have to repair their cars, how often they walk through giant puddles.

And then we paired it with a breakout box of every city council member and their score. We did a map online, which, for over a year, actually, has been a big hit while the city is discussing how they’re going to find this money. People have been using it as a basis for the debate that they’re having, which, to me, is really kind of how we make a difference. Using this data that the city had, bringing it to light, making it accessible, I think, has really just changed the debate here for people. So that’s one thing I’m really proud of — that we can give people information to make informed decisions.
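The merge-and-color-code step Malan describes could be sketched roughly as follows. This is only an illustration, not the Caller-Times's actual workflow: the street IDs, names, scores, and color thresholds are all invented, and a real Pavement Condition Index file would have its own layout.

```python
# Hypothetical data: IDs, names, scores, and color bands are invented for illustration.
pci_scores = [
    {"street_id": "S1", "pci": 0},
    {"street_id": "S2", "pci": 55},
    {"street_id": "S3", "pci": 88},
]
streets = [
    {"street_id": "S1", "name": "Example Ave", "block": "100"},
    {"street_id": "S2", "name": "Sample St", "block": "200"},
    {"street_id": "S3", "name": "Demo Dr", "block": "300"},
    {"street_id": "S4", "name": "Unscored Ln", "block": "400"},  # no score on file
]


def color_for(pci):
    """Map a score to a map color; the cutoffs here are assumptions."""
    if pci is None:
        return "gray"
    if pci < 40:
        return "red"
    if pci < 70:
        return "yellow"
    return "green"


def merge_streets(streets, pci_scores):
    """Attach each street's score (if any) and a color to its record."""
    score_by_id = {row["street_id"]: row["pci"] for row in pci_scores}
    merged = []
    for street in streets:
        pci = score_by_id.get(street["street_id"])  # None when no score exists
        merged.append({**street, "pci": pci, "color": color_for(pci)})
    return merged


merged = merge_streets(streets, pci_scores)

# Streets that scored a 0, the kind of list a reporter might use to find residents.
worst = [row for row in merged if row["pci"] == 0]
for row in worst:
    print(row["block"], row["name"], row["color"])
```

Keeping unscored streets in the output (gray here) rather than dropping them makes gaps in the source data visible, which can be a story in itself.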

O’Donovan: Part of your new position is going to be facilitating and assisting other journalists in starting to understand how to do this kind of work. How do you tell reporters that this isn’t scary — that it’s something they can do or they can learn? How do you begin that conversation?
Malan: [At the Caller-Times] we adopted the philosophy that data journalism isn’t just something that one nerdy person in the office does, but something that everyone in the newsroom should have in their toolbox. It really enhances every beat at the newspaper.

I would do training sessions occasionally on Excel, Google Fusion Tables, Caspio to show everyone in the newsroom, “Here’s what’s possible.” Some people really pick up on it and take it and run with it. Some people are not as math oriented and are not going to be able to take it and run with it themselves, but at least they know those tools are available and what it’s possible to do with them.

So some of the reporters would be just aware of how we could analyze data and they would keep their eyes open for databases on their beats, and other reporters would run with it. That philosophy is very important in any newsroom today. A lot of what I’m going to be doing with IRE and INN is working with the INN members in helping them to gather the data and analyze it and inform their local reporting. So a lot of the same roles, but in a broader context.

O’Donovan: So a lot of it is understanding that everyone is going to come at it with a different skill level.
Malan: Yes, absolutely. All our members have different levels of skills. Some of our members have very highly skilled data teams, like ProPublica, Center for Public Integrity — they’re really at the forefront of data journalism. Other members are maybe one- or two-person newsrooms that may not have the training and don’t have any reporters with those skills. So the skill sets are all over the board. But it will be my job to help, especially smaller newsrooms, plug into those resources — especially the resources at IRE — the best they can, with the data library there and the training available there. We help them bring up their own skills and enhance their own reporting.
O’Donovan: When a reporter comes to you and says, “I just found this dataset or I just got access to it” — how do you dive into that information when it comes to looking for stories? How do you take all of that and start to look for what could turn into something interesting?
Malan: A lot of it depends on the data set. Just approach every set of data as a source that you’re interviewing. What is available there? What is maybe missing from the data is something you want to think about, too. And you definitely want to narrow it down: A lot of data sets are huge, especially these federal data sets that might have records containing, I don’t know, 120 fields, but maybe you’re only interested in three of them. So you want to get to know the data set, and what is interesting in it, and you want to really narrow your focus.

One collaboration that INN did was using data gathered by NASA for the FAA, and it was essentially near misses — incidents at airports like hitting deer on the runway, and all these little things that can happen but aren’t necessarily reported. They all get compiled in this database, and pilots write these narratives about it, so that field is very interesting to them. There were four or five INN members who collaborated on that, and they all came away with different stories because they all found something else that was interesting for them locally.
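As a rough sketch of the "interview the data" advice above, the snippet below narrows a wide file to a few fields of interest and counts what is missing in each. The column names and records are invented for illustration and do not reflect the actual layout of the NASA data or any other real dataset.

```python
import csv
import io

# Hypothetical extract: columns and rows are invented for illustration.
RAW = """id,state,incident_type,narrative,field_5,field_6
1,TX,wildlife strike,Hit a deer on rollout,x,y
2,TX,near miss,,x,y
3,MO,wildlife strike,Bird ingested on takeoff,x,y
"""

# The handful of fields we actually care about out of the full record.
KEEP = ["state", "incident_type", "narrative"]


def narrow(rows, keep):
    """Return copies of the rows containing only the fields in `keep`."""
    return [{field: row[field] for field in keep} for row in rows]


def count_missing(rows, fields):
    """Count blank values per field, a first check on what the data lacks."""
    return {f: sum(1 for r in rows if not r[f].strip()) for f in fields}


rows = list(csv.DictReader(io.StringIO(RAW)))
slim = narrow(rows, KEEP)
missing = count_missing(slim, KEEP)

print(slim[0])
print(missing)
```

With a real file, `io.StringIO(RAW)` would be replaced by an opened file handle; the rest of the logic stays the same.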

O’Donovan: This position you’ll hold is about bringing the work of INN and IRE together. What’s that going to look like? We talk all the time about how journalism is moving in a more networked direction — where do you see this fitting into that?
Malan: IRE and INN have always had a very close relationship, and I think that this position just kind of formalizes that. I will be helping INN members plug into the resources of IRE, especially the data library. I’ll be working closely with Liz Lucas, the database director at IRE, and I’m actually going to be living near IRE so I can work more closely with them. Some of that data there is very underutilized and it’s really interesting and maybe hasn’t been used in any projects, especially on a national level.

So we can take that data and I can kind of help analyze it, help slice it for the various regions we might be looking at, and help the INN members use that data for their stories. I’ll basically be acting as almost a translator to get this data from the IRE and help the INN members use it.

Going the other way, with INN members, they might come up with some project idea where data isn’t available from the database library, or it might be something where we have to gather data from every state individually, so we might compile that and whatever we end up with will be sent back to the IRE library and made available to other IRE members. So it’s a two-way relationship.

O’Donovan: So in terms of managing this collaboration, what are the challenges? Are you thinking of building an interface for sharing data or documents?
Malan: We’re going to be setting up a kind of committee of data people with INN to have probably monthly calls and just discuss ideas, what they’re working on, brainstorming, possible ideas. I want it to be a very organic, ground-up process — I don’t want it to be dictating what the projects should be. I want the members to come up with their own ideas. So we’ll be brainstorming and coming up with things, and we’ll be managing the group through Basecamp and communicating that way. A lot of the other members are already on Basecamp and communicate that way through INN.

We’ll be communicating through this committee and coming up with ideas, and I’ll be working with other members to reach out to them. If we come up with an idea that deals with health care, for example, I might reach out to some of the members that are especially focused on health care and try to bring in other members on it.

O’Donovan: Do you foresee collaborations between members, like shared reporting and that kind of thing?
Malan: Yeah, depending on the project. Some of it might be shared reporting; some of it might be someone does a main interview. If we’re doing a crime story dealing with the FBI’s Uniform Crime Report, maybe we just have one reporter from every property, we nominate one person to do the interview with the FBI that everyone can use in their own story, which they localize with their own data. So, yeah, depending on the project, we’ll have to kind of see how the reporting would shake out.
O’Donovan: Do you have any specific goals or types of stories you want to tell, or even just specific data sets you’re eager to get a look at?
Malan: I think there are several interesting sets in the IRE data library that we might go after at first. There’s really interesting health sets, for example, from the FDA — one of them is a database of adverse effects from drugs, complaints that people make that drugs have had adverse effects. So yeah, some of those can be right off the bat, ready to go and parse and analyze.

Some other data sets we might be looking at will be a little harder to get, will take some FOIs and some time to get. There are several major areas that our members focus on and that we’ll be looking at projects for. Environment, for example — fracking is a large issue, and how environment affects public health. Health care, especially with the Affordable Care Act coming into effect next year, is going to be a large one. Politics, government, how money influences politicians is a huge area as we come up on the 2016 elections and the 2014 midterms. And education is another issue with achievement gaps, graduation rates, charter schools — those are all large issues that our members follow. Finding those commonalities and dealing with data sets, digging into that is going to be my first priority.

O’Donovan: The health question is interesting. Knight announced its next round of News Challenge grants is going to be all around health.
Malan: I’m excited about that. We have several members that are really specifically focused on health, so I feel like we might be able to get something good with that.
O’Donovan: Health care stuff or more public health stuff?
Malan: It’s a mix, but a lot of stuff is geared toward the Affordable Care Act now.
O’Donovan: Gathering these data sets must often involve a lot of coordination across states and jurisdictions.
Malan: Yeah, absolutely. One thing I am a little nervous about is the Supreme Court’s recent ruling in the Virginia case where they can now require you to live in a state to put in an FOI. That might complicate things a little bit. I know there are several groups working on lists of people who will put an FOI in for you in various states. But that can kind of just slow down the process and put a little kink in and add to the timeline. I’m concerned of course that now they know it’s been ruled constitutional that every state might make that the law. It could be a huge thing. A management nightmare.
O’Donovan: What kind of advice do you normally give to reporters who are struggling to get information that they know they should be allowed to have?
Malan: That’s something we encountered a lot here, especially getting data in the proper format, too. Laws on that can vary from state to state. A lot of governments will give you paper or PDF format, instead of the Excel or text file that you asked for. It’s always a struggle.

The advice is to know the law as best you can, know what exceptions are allowed under your state law, be able to quote — you don’t have to have the law memorized, but be able to quote specific sections that you know are on your side. Be prepared with your requests, and be prepared to fight for it. And in a lot of cases, it is a fight.

O’Donovan: That’s an interesting intersection of technical and legal skill. That’s a lot of education dollars right there.
Malan: Yeah, no kidding.
O’Donovan: When you do things like attend the NICAR conference and assess the scene more broadly, where do you see the most urgent gaps in the data journalism field? Is it that we need more data analysts? More computer scientists? More reporters with the fluency in communicating with government? More legal aid? If you could allocate more resources, where would you put them right now?
Malan: There’s always going to be a need for more very highly skilled data journalists who can gather these national sets, analyze them, clean them, get them into a digestible format, visualize them online, and inform readers. I would like to see more general beat reporters interested in data and at least getting skills in Excel and even Access — because the beat reporters are the ones on the ground, using their sources, finding these data sets or not finding them if they’re not aware of what data is. I would really like this to be a bigger push to at least educate most general beat reporters to a certain level.
O’Donovan: Where do you see the data journalism movement headed over the next couple years? What would your next big hope for the field be?
Malan: Well, of course I hope for it to go kind of mainstream, and that all reporters will have some sort of data skills. It’s of course harder with fewer and fewer resources, and reporters are learning how to tweet and Instagram, and there are demands on their time that have never been there.

But I would hope it would become just a normal part of journalism, that there would be no more “data journalism” — that it just becomes part of what we do, because it’s invaluable to reporting and to really helping ferret out the truth and to give context to stories.

April 27 2012


At the International Journalism Festival: Can Data Journalism Save Newsrooms?

PERUGIA, Italy -- Here at the International Journalism Festival the launch of three large initiatives has generated a lot of the buzz around data journalism.

The School of Data Journalism, organized by the European Journalism Centre and the Open Knowledge Foundation, is composed of three panels and five workshops and dives into some of the key issues that media organizations are currently considering: "Is it worth my while starting out trying to do data journalism?", "Will data journalism make us money?", "How do you get data that you can search, filter and analyze with a computer?" and "How do I make data stories sexy?"


In addition, the 58 nominations for the Data Journalism Awards (DJA) were announced. DJA is the first international competition that recognizes and showcases the great work done in data journalism. Prizes are awarded for data-driven applications, investigations, and storytelling through visualizations. It's hoped that these awards will encourage more news organizations to embark on more ambitious data projects and alleviate the "loneliness in the newsroom" which some data journalists experience when their colleagues don't understand what they do. The six winners will be announced May 31.

And on Saturday, the Data Journalism Handbook will be launched. The handbook was born at the Mozilla Festival in November. It's a collection of tips, anecdotes and case studies from more than 70 leading data journalists and data wranglers, including contributions from The New York Times, Zeit Online, the BBC, the Guardian and many more. The book will be an open educational resource with key lessons a beginner data journalist should know. You can see a chapter overview of the handbook here and an excerpt from the first chapter here. A free version will be available online at datajournalismhandbook.org, and an e-book and print version will soon be published by O'Reilly Media.

So what is data journalism?

The School of Data Journalism, a series of panel discussions and workshops at the festival, was led by leading practitioners from all over the world and aimed to show participants what data journalists can do and why they should take the plunge and learn new skills.

The definition of data journalism varies depending on whom you ask. For some journalists, it's simply the courage to tackle sometimes huge and messy datasets. For others, it's being transparent and open about "showing the working" behind their conclusions, backing up their stories with facts and numbers where one might previously have only evidenced their point with "he said/she said." For others, it's a new way of presenting data through visualizations and interactive news applications; news is no longer simply static words on a page.

Increasingly, though, many are coming to realize that data journalism is a set of skills, involving new methods for acquiring, analyzing and working with data which simply weren't computationally feasible before. In an age that is positively drowning in data, we need more data journalists who typically have better storytelling skills than statisticians and can act as translators of complex datasets for the benefit of the public.

As activist and author Heather Brooke put it in the "Information wants to be free" workshop, data journalism is a misnomer -- one doesn't say "telephone journalism" if you contact your sources via telephone; journalists have to use data to do their job well.

Guerrilla Tactics: how to get started with Data Journalism

In the first panel of the school, "From Computer Assisted Reporting to Data Journalism," Pulitzer Prize winners Sarah Cohen and Steve Doig highlighted their experiences working in the United States, where the notion of Computer Assisted Reporting (CAR) has been around for several decades -- far longer than the budding data journalism scene here in Europe.

They described their experiences learning how to use tools and techniques -- unfamiliar to journalists but popular in other disciplines such as social science and history -- to stay at the cutting edge of journalism. They also described the "guerrilla tactics" they initially had to use to get their work into print. "If you produce an amazing visualization, your editor is going to find a way to get it published," Cohen said, adding that it's far easier to show someone what data journalism is than to explain what it is.


Next up, Aron Pilhofer described his journey to data journalism at the New York Times. He said it came from a feeling of frustration with the inefficiency of working practices and tools. This sentiment resonated strongly with the other panelists -- a common complaint concerned individual journalists holding onto their data, producing datasets that only they could understand, instead of resources that could be built on and expanded by others on their teams.

On the same panel, Elisabetta Tola of formicablu and Simon Rogers of the Guardian gave a European perspective on data journalism. Rogers demonstrated how the Guardian Datablog's interactive maps of the U.K. riots helped dispel false statements by the government that the "riots were not about poverty." Tola then explained some of the more basic problems facing wannabe data journalists in Italy, some of whom would be lucky to get data even on paper, as it's common for officials to simply dictate the numbers to journalists.

The second panel, "How can data journalism save your newsroom?", examined perspectives and business models for data journalism, and attempted to answer the question: "Is it worth it?"

Caelainn Barr of Citywire urged journalists not to consider data journalism as a fix-all that will save anything. She warned that editors are unlikely to be considerate and give you more time just because you're using complex data or working hard to present it better. Barr said journalists are constantly playing a game of catchup; advertisers are moving elsewhere; and journalists have less time to produce their stories and are struggling to keep up. All of this means journalists have to be more agile and learn to do things more efficiently.

To solve this problem, Pilhofer said, the New York Times has built resources that live on for future stories, allowing both journalists and the interactive news team to spring into action as soon as a related story breaks.

"What is the simplest thing you can do to start with data journalism?" ProPublica's Dan Nguyen asked rhetorically. "Keep your notes in a spreadsheet." He said often, the skills required to find stories involve sorting, grouping and averaging the data. With skills this simple, can newsrooms really afford not to teach them to their journalists?
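To give a sense of how small those three operations are, here is a minimal sketch of sorting, grouping and averaging a spreadsheet-like set of notes. The agencies, years and amounts are invented for illustration; in practice the rows would come from a reporter's own spreadsheet export.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical notes kept as rows, as in a spreadsheet; all values are invented.
notes = [
    {"agency": "Parks", "year": 2011, "amount": 1200.0},
    {"agency": "Roads", "year": 2011, "amount": 5400.0},
    {"agency": "Parks", "year": 2012, "amount": 1800.0},
    {"agency": "Roads", "year": 2012, "amount": 4600.0},
]

# Sorting: largest amounts first.
by_amount = sorted(notes, key=lambda row: row["amount"], reverse=True)

# Grouping: collect the amounts recorded for each agency.
groups = defaultdict(list)
for row in notes:
    groups[row["agency"]].append(row["amount"])

# Averaging: mean amount per agency.
averages = {agency: mean(amounts) for agency, amounts in groups.items()}

print(by_amount[0])
print(averages)
```

The same three steps map directly onto a spreadsheet's sort, pivot table and AVERAGE functions, which is the level of skill Nguyen is pointing at.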

The Future of Journalism is Bold

What does the future look like for data journalism? "Data journalism is just becoming journalism," said the Guardian's Rogers -- which was possibly the most encouraging statement from any of the panelists here.

Data journalism is no longer limited to only those who can afford to pay $900 for a piece of visualization software. Now incredibly powerful, open-source solutions are available. Organizations such as ProPublica encourage others to use their approaches in other stories to bring data journalism to local levels.

However, a change in culture will be needed to get more journalists into the fold. As Tola explained, collaboration is key, both journalist-journalist and journalist-coder collaboration. As Wired Italy's Guido Romeo put it, "Journalism is a one-man band. Data journalism is clearly not."

As technology develops, the ways of presenting this information just become more exciting. Could Italy be a land of opportunity for data journalism? The enthusiasm with which the workshops were met gave the impression that he who dares first will have a serious competitive advantage.

The workshops will continue over the next couple of days, and many have spaces open. Any budding data journalists? Join us! The Data Journalism Handbook will be launched at 6:30 p.m. PDT on April 28.
