
September 28 2010

14:00

June 01 2010

17:00

Aggregators, curators, and indexers: There’s a difference, and it matters

Aggregation. Curation. Indexing. They’re all the same, aren’t they? Ask any serious online journalist or new media entrepreneur, and the answer will be quick and obvious: of course not! But in the public debate over the future of journalism — especially the debate as framed by legal analysts and public officials — the words often get thrown around as if they are identical. Ordinarily, such word quibbling would seem a little sad. But in the current context, where every aspect of journalism is up for grabs and concepts like “the hot news doctrine” are discussed in serious tones, words and definitions mean a great deal. So I thought it might be worth a little time thinking about what we mean by aggregation, by curation, and by indexing. In other words: if you’re an “aggregator,” what is it, exactly, that you do?

To give a sense of how these terms are increasingly being lumped together, and some of the problems this might cause, I want to highlight the first couple of paragraphs from the written materials distributed at the Online Media Legal Network’s “Journalism’s Digital Transition,” a conference I attended at Harvard a few weeks ago. The conference, by the way, was great, and I don’t mean to pick on the OMLN. But I did think that the discussion of aggregation included in their CLE (Continuing Legal Education) materials really summed up the issues that I wanted to get at in this post. In the document “News Aggregation and Copyright Fair Use,” conference attendees read:

One of the hottest topics in copyright law these days is the rise of the news aggregator, from Google News to the Huffington Post … debate arises when third-parties get into the act [of] reselling and profiting from information generated by traditional media organizations.

Of course, building a business model around monetizing another’s website content isn’t novel, and methods for doing so have been around for almost as long as the Internet has been considered a viable commercial entity. Consider the practice of framing, or superimposing ads, onto linked websites … News aggregators, which take information from multiple websites and display it on a single page, providing a convenient one-stop resource for readers, are merely the latest flavor-of-the-week.

Though Google News may be the most well known commercial news aggregator, there are many others, such as the Huffington Post and Newser.com. Some use only headlines and links, others copy full (or nearly full) articles and photos. Nearly all receive ad revenue, many based on page views that, copyright owners allege, are being diverted from websites that originate the content.

Are Google News, Huffington Post, and Newser.com the same? How about the other online organizations traditionally tossed into the mix, such as Gawker? If you view the online news ecosystem as basically bifurcated into two categories (content originators and content reusers), then this view of the world might make sense. In the above model, the primary issue isn’t what these sites actually do all day, but the fact that they “receive ad revenue, many based on page views that, copyright owners allege, are being diverted from websites that originate the content.” And yet, as soon as you start to conceptually differentiate between Google News and the Huffington Post, it becomes clear that there’s a much more complex news ecosystem out there.

So what’s actually going on online? I thought it might be interesting to take one of our very own Lab posts, Mark Coddington’s all-around smashing This Week in Review, and parse out the ways that Mark engages in both what I’d call “aggregation” and “curation.” In essence, I think the upper sections of This Week in Review are fundamentally different from the bottom, concluding section, and the differences between the two sections point to different ways of doing online newswork.

The first dozen paragraphs of TWIR are usually broken down into three or four “hot topics” that are big in the future of journalism world that week. As Mark told me when I emailed him and asked him to explain his thinking behind This Week in Review, the upper sections

explore a discussion — a news development with commentary surrounding it, or ideas that spark responses and thus launch (or, usually, continue) a conversation. With those sections, I see myself as mapping out a discussion — explaining who’s on what side, what each person is saying and where that places them in relation to everyone else…If I see some substantive discourse coalescing around an article, that’s more likely to merit its own section because there are several connections I feel I need to explain (i.e. Person A said this, Person B responded with this, and Person C and D reminded both A and B of this and this).

Let’s take one recent TWIR as an example. The hot topics picked by Mark involved (1) the continuing controversy over Facebook, (2) a discussion of iPad apps, (3) New York Times and Wall Street Journal paywalls, and (4) finally, a good overview of recent pieces on new digital news experiments. I’d call this first, lengthiest section of the Week in Review “content aggregation and analysis.” In the old days I would have just called it “blogging.”

  • The topics Mark discusses in This Week in Review emerge from a deep immersion in the conversation about the future of journalism, and a lengthy period of active listening to what people are saying. I follow future-of-journalism news pretty closely, and I’ve almost never disagreed with Mark’s analysis about what the important topics of the week are. In short, I trust his judgment. But it’s a judgment that stems from deep, active engagement in the topic at hand.
  • The way Mark highlights the contours of the debate is through linking back to his original sources. The discussion of Facebook contains 17 links in four paragraphs.
  • Mark occasionally (but not often) weighs in on one of the debates, but he does it pretty subtly, and the bulk of This Week in Review is definitely taken up with summarizing and translating what others are saying.

The second part of TWIR — and it’s usually just a few paragraphs — is called “Reading Roundup.” I’d call this part of This Week in Review “curation,” and it strikes me as pretty different from the rest of the piece. It’s not as centered around debates, and the links tend to go to online content which is more “think-piecey.” In this section, Mark seems to be listening a little bit less, and exercising a bit more personal judgment. I hear him telling me: “Hey! You’ve followed the piece to the end, which tells me you really care about this issue. Since I think we share similar interests, you might like these pieces too!” Or as Mark put it when I quizzed him about the difference:

You’re right — there is a difference between the “reading roundup” and the rest of the weekly review posts…with the reading roundups, I’m merely pointing the reader toward an interesting link without substantively explaining its connection to the rest of the journalism-in-transition world. Essentially, the reading roundup is like me inviting you to a party, while the main sections are like me walking you through a room at that party, introducing you to people, explaining who’s who, and giving you a sense of who you might enjoy talking to.

Finally, compare both of these forms of writing to something like Google News, which uses complex algorithms to determine what the hot topics of the minute are, what counts as a spotlight story, and how to rank stories in order of originality and importance. If Google News looks like anything, it’s a phone book — or one of those yearly news indexes in the big green binders you used to encounter in libraries, just more up to date. There isn’t the same sense of “listening,” the process of judgment seems different, and most importantly, there isn’t the same kind of interstitial commentary surrounding the links. For me, what Google News and other sites do might productively be called “indexing.”

Because this blog post is already over 1,300 words, I’m not going to get into the question posed by Ken Doctor: Can’t we just call all this stuff “content arbitrage”? Maybe that’s the subject for another post, but the short answer is I don’t think you can. I think we need to begin to compare the new forms of journalistic work that exist online, not just to some imaginary ideal of “content creation” versus an evil “repurposing,” but to each other.

Ultimately, why does all this matter? Is there an ultimate upshot of all this linguistic parsing?

For me, the lesson is simple. Anytime you hear someone talk about Google News, The Huffington Post, Gawker, blogging, aggregating, curation, and indexing as if they are the same phenomenon, ignore them. And if they attach that discussion to a set of policy recommendations, without acknowledging the full complexity of what it is people actually do when they aggregate, curate, and index information — well, then you should put your fingers in your ears and run in the other direction.

November 19 2009

14:00

Need a lawyer? New network gives web publishers a line of defense

If you’ve gone the entrepreneurial route you know that first flush of enthusiasm often dampens when nitty-gritty decisions need to be made. There’s accounting, taxes, incorporation, insurance — and that’s the clear stuff. Toss in murky issues around trademark and branding and it’s easy to see how dreams of independence get squelched.

The Citizen Media Law Project at Harvard’s Berkman Center doesn’t want those entrepreneurial instincts to wither on the vine. It’s just launched an ambitious collection of free legal resources called the Online Media Legal Network (OMLN), the centerpiece of which is a matchmaking service that connects online publishers with attorneys who can address their specific needs. It’s a full-service effort, covering everything from basic business structure to contracts to representation in court.

OMLN is open to any online publisher that meets the network’s requirements. Organizations must be independent, journalism-minded, and have an eye toward sustainability either as for-profit businesses or nonprofits. If that describes your outfit, you can start the application process here.

The really good news is that pro bono assistance is available and the thresholds are generous. For-profit organizations that make less than $100,000 gross annual revenue qualify, as do nonprofits with operating budgets under $250,000. The high ceiling should cover the growing legion of bootstrapped web publishers.
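For anyone who wants the dollar thresholds spelled out, here is a minimal sketch of just that part of the rule. The function name, the labels "for-profit" and "nonprofit," and the constant names are made up for illustration. The figures come from the paragraph above. This does not cover the network's other requirements (independence, journalistic focus, sustainability), which are assessed by people rather than by numbers.

```python
# Illustrative only: names and structure are assumptions, not OMLN's system.
FOR_PROFIT_REVENUE_CEILING = 100_000   # gross annual revenue, in dollars
NONPROFIT_BUDGET_CEILING = 250_000     # operating budget, in dollars


def meets_pro_bono_threshold(org_type: str, annual_amount: float) -> bool:
    """Check only the dollar threshold described in the article.

    org_type is "for-profit" (annual_amount = gross annual revenue)
    or "nonprofit" (annual_amount = operating budget).
    """
    if org_type == "for-profit":
        return annual_amount < FOR_PROFIT_REVENUE_CEILING
    if org_type == "nonprofit":
        return annual_amount < NONPROFIT_BUDGET_CEILING
    raise ValueError(f"unknown organization type: {org_type!r}")


if __name__ == "__main__":
    print(meets_pro_bono_threshold("for-profit", 60_000))
    print(meets_pro_bono_threshold("nonprofit", 300_000))
```

Both comparisons are strict because the article says "less than" and "under." How the network treats an organization sitting exactly on the line is not stated.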

“As long as their work is in the public interest, as long as it involves adherence to journalistic standards, then they’re going to be able to get help through the network until they’ve grown to the point where they are no longer entitled to free services,” said our friend David Ardia, the Project’s director.

Deeper-pocketed clients who don’t fall within the pro bono requirements are encouraged to apply, for free, as well. They’ll just have to arrange payment terms with a matched attorney.

More than a directory

Machine intelligence and algorithms can’t encompass all the variations in client needs and attorney specialties. That’s why four OMLN lawyers drive the process through extensive client screenings. These screenings need to capture a lot of nuance because applicants aren’t judged against any quantitative criteria, like page views or posting frequency.

Here’s how the matching process works: A lawyer in the network logs in to the site and is presented with client requests matching the lawyer’s pre-defined criteria (“nonprofits in California” or “clients who want to incorporate,” that sort of thing). Client names are not revealed at this point. The lawyer selects a specific request, and an OMLN staffer determines if the pairing is a good fit. If it is, the lawyer receives detailed information so he/she can check for conflicts with existing clients. The lawyer and the new OMLN client then get in touch directly and OMLN fades into the background. Either side can opt out if the match doesn’t feel right. Once the client’s legal issue is resolved, OMLN gathers feedback through private surveys with both parties.
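The first step of that process, showing a lawyer only the requests that fit his or her criteria while keeping client names hidden, can be sketched roughly as follows. Everything here is hypothetical: the `ClientRequest` fields, the `anonymized_matches` function, and the idea that criteria are simple key/value pairs are assumptions for illustration, not a description of how the OMLN site is built. The later steps (staff review, conflict checks, opt-outs) are human decisions and are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientRequest:
    # Hypothetical fields; the real intake data is not described in the article.
    client_name: str
    org_type: str   # e.g. "nonprofit" or "for-profit"
    state: str      # e.g. "CA"
    need: str       # e.g. "incorporation"


def anonymized_matches(requests, criteria):
    """Return requests that satisfy every criterion, with client names withheld."""
    matches = []
    for request in requests:
        public_fields = {
            "org_type": request.org_type,
            "state": request.state,
            "need": request.need,
        }
        # A request matches only if every criterion equals the request's value.
        if all(public_fields.get(key) == wanted for key, wanted in criteria.items()):
            matches.append(public_fields)  # client_name is deliberately omitted
    return matches


if __name__ == "__main__":
    queue = [
        ClientRequest("Example News", "nonprofit", "CA", "incorporation"),
        ClientRequest("Other Site", "for-profit", "NY", "contracts"),
    ]
    for match in anonymized_matches(queue, {"org_type": "nonprofit", "state": "CA"}):
        print(match)
```

Returning plain dictionaries without the name field mirrors the point that client names are not revealed until a staffer approves the pairing.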

OMLN needs to maintain balance if it’s going to be useful, Ardia said. Too many clients, and online publishers won’t receive timely help. Too many lawyers, and frustration mounts over a lack of opportunities. Equilibrium is struck through a “slow as you go” approach that was honed while the site was being built. OMLN’s initial batch of clients was limited to past winners of the Knight News Challenge, and lawyers were invited to join based on their skill sets. Some amount of calibration will continue now that the site is officially open, with the aim of matching clients and lawyers within three to four weeks of a request for assistance. That’s pretty quick considering the effort and issues at play.

OMLN itself is a 2007 News Challenge winner. It used an initial $250,000 grant to get the ball rolling, and it’s now running on two subsequent years of Knight funding. The goal is to make OMLN sustainable by the time funding runs out next October. Ardia hopes that since OMLN doesn’t bring in any money through the service, law firms and others will donate to support its continued operation.
