Archive for the 'newsfeed' Category

The Great Unbundling: A Reprise

This piece by Nick Carr, the author of the recently popular “Is Google Making Us Stupid?” in the Atlantic, is fantastic.

My summary: A print newspaper or magazine provides an array of content in one bundle. People buy the bundle, and advertisers pay to catch readers’ eyes as they thumb through the pages. But when a publication moves online, the bundle falls apart, and what’s left are just the stories.

This may no longer be a revolutionary thought to anyone who knows that Google is now their readers’ homepage, from which people enter their site laterally through searches. But that doesn’t mean it’s not the new gospel for digital content.

There’s one problem with Carr’s argument, though: by focusing on the economics of production, its observation of unbundling doesn’t go far enough. Looked at another way, from the economics of consumption and attention, not even stories are left. In actuality, there are just keywords entered into Google searches. That’s increasingly how people find content, and in an age of content abundance, finding it is what matters.

That’s where our under-wraps project comes into play. We formalize the notion of people finding content through simple abstractions of it. Fundamentally, from the user’s perspective, the value proposition lies with the keywords, or the persons of interest, not the piece of content, which is now largely commodified.

That’s why we think it’s a pretty big idea to shift the information architecture of the news away from focusing on documents and headlines and toward focusing on the newsmakers and tags. (What’s a newsmaker? A person, corporation, government body, etc. What’s a tag? A topic, a location, a brand, etc.)

The kicker is that, once content is distilled into a simpler information architecture like ours, we can do much more exciting things with it. We can extract much more interesting information from it, make much more valuable conclusions about it, and ultimately build a much more naturally social platform.

People will no longer have to manage their intake of news. Our web application will filter the flow of information based on their interests and the interests of their friends and trusted experts, allowing them to allocate their scarce attention most efficiently.

It comes down to this: Aggregating documents gets you something like Digg or Google News, which is great for attracting passive users who want to be spoon-fed what’s important. But few users show up at Digg with a predetermined interest, and predetermined interest is what let Google monetize search ads over display ads and bring Yahoo to its knees. Aggregating documents makes sense in a document-scarce world; aggregating the metadata of those documents makes sense in an attention-scarce world. When it comes to the news, newsmakers and tags comprise the crucially relevant metadata, which can be rendered in a rich, intuitive visualization.
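To make the contrast concrete, here is a minimal Python sketch of the metadata-centric architecture described above. The articles, newsmakers, and tags are invented for illustration; the point is only the inversion from a document-centric store to a metadata-centric index.

```python
from collections import defaultdict

# Hypothetical articles annotated with newsmakers and tags
# (names and structure are illustrative, not a real schema).
articles = [
    {"headline": "Fed holds rates steady",
     "newsmakers": ["Ben Bernanke"], "tags": ["economy", "interest rates"]},
    {"headline": "Bernanke testifies on housing",
     "newsmakers": ["Ben Bernanke"], "tags": ["economy", "housing"]},
    {"headline": "New iPhone announced",
     "newsmakers": ["Steve Jobs"], "tags": ["technology"]},
]

# Invert the document-centric store into a metadata-centric index.
by_newsmaker = defaultdict(list)
by_tag = defaultdict(list)
for art in articles:
    for person in art["newsmakers"]:
        by_newsmaker[person].append(art["headline"])
    for tag in art["tags"]:
        by_tag[tag].append(art["headline"])

# A reader's attention starts from a person or topic, not a headline.
print(by_newsmaker["Ben Bernanke"])
print(by_tag["economy"])
```

The document becomes a leaf node; the newsmaker or tag is the entry point a user actually queries.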

Which isn’t to say that passive users who crave spoon-fed documents aren’t valuable. We can monetize those users too—by aggregating the interests of our active users and reverse-mapping them, so to speak, back onto a massive set of documents in order to find the most popular ones.

Spiffy Concept

Caveat user: RSS lava lamps

So be good at long-term trends, not just short-term ones. And situate your visualization in the user’s context: different users should see different visualizations depending on their differences. Also, make it easy, not difficult, to combine different data sources. Finally, make your visualizations actually social and easy to share.

Wow, sounds hard.

Programmable Information

From Tim O’Reilly:

But professional publishers definitely have an incentive to add semantics if their ultimate consumer is not just reading what they produce, but processing it in increasingly sophisticated ways.

In the past and present days of the web and media, publishers competed on price. If your newspaper or book or CD was the cheapest, that was a reason for someone to buy it. As information becomes digital, and the friction of exchange wears away, information will tend to be free. (See here, here, and here, and about a million other places.) That makes competing on price pretty tough.

Of course, publishers also competed, and still do, on quality. As they should. I suspect that readers will never stop wanting their newspaper articles well sourced, well argued, and well written. Partisan readers will never stop wanting their news to make the good guys look good and the bad guys look bad. That’s all in the data.

The nature of digital information, however, changes what information consumers will find high-quality. Now readers want much more: they want metadata. That’s what O’Reilly’s talking about. That’s what Reuters was thinking when it acquired ClearForest.

Readers won’t necessarily look at all the metadata the way they theoretically read an entire article. Instead readers might find the article because of its metadata, e.g., its issues, characters, organizations, or the neighborhood it was written about. Or they might find another article because it shares a given metadatum or because its set of metadata is similar. Or, another step out, they might find another reader who’s enjoyed lots of similar articles.
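One hedged sketch of that “similar set of metadata” idea: treat each article’s metadata as a set and rank neighbors by Jaccard overlap. The article IDs and metadata below are made up, and real systems would weight entities more carefully, but the mechanism is this simple at its core.

```python
# Rank articles by overlap of their metadata sets (Jaccard similarity).
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Invented metadata sets: issues, characters, organizations, places.
metadata = {
    "article-1": {"Ben Bernanke", "economy", "housing"},
    "article-2": {"Ben Bernanke", "economy", "interest rates"},
    "article-3": {"Steve Jobs", "technology"},
}

def most_similar(article_id):
    """Return the other article whose metadata overlaps most."""
    others = [(other, jaccard(metadata[article_id], tags))
              for other, tags in metadata.items() if other != article_id]
    return max(others, key=lambda pair: pair[1])[0]

print(most_similar("article-1"))  # article-2 shares two of four metadata
```

The same overlap measure, applied between readers’ histories instead of articles, gives the “find another reader” step.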

The point is that, if your newspaper has metadata that I can use, that gives me a reason to buy it (or to look at the ad next to it).

Actually, it’s not that simple. The New York Times annotates its articles with a few tags hidden in the html, and almost no one pays any attention to those tags. Few would even if the tags were surfaced on the page. Blogs have had tags for years, and no one’s really using that metadata, however meager, to great effect.

When blogs do have systematic tags, the way I take advantage of them is by way of an unrelated web application, namely, Google Reader. I can, for instance, subscribe to the RSS feed on this page, which aggregates all the posts tagged “Semantic Web” across ZDNet’s family of blogs. Without RSS and Google Reader, the tags just aren’t that useful. The metadata tells me something, but RSS and a feed reader allow me to lump and split accordingly.
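Here is a toy illustration of the lumping and splitting a feed reader does with category metadata, using Python’s standard library and an invented RSS document in place of ZDNet’s real feed (which a real reader would fetch over HTTP):

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 document; only the <category> elements matter here.
rss = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <item><title>Calais launches</title><category>Semantic Web</category></item>
  <item><title>Quarterly earnings</title><category>Business</category></item>
  <item><title>RDF in practice</title><category>Semantic Web</category></item>
</channel></rss>"""

# Parse the feed and keep only the items tagged "Semantic Web".
root = ET.fromstring(rss)
wanted = [item.findtext("title")
          for item in root.iter("item")
          if any(c.text == "Semantic Web"
                 for c in item.findall("category"))]
print(wanted)  # ['Calais launches', 'RDF in practice']
```

The tags were always in the markup; the reader is what turns them into a split.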

Google Reader allows consumers to process ZDNet’s metadata in “sophisticated ways.” Consumers can’t do it alone, and there’s real opportunity in building the tools to process the metadata.

Without the tools to process the metadata, the added information isn’t terribly useful. That’s why it’s a big deal that Reuters has faith that, if it brings forth the metadata, someone will build an application that exploits it, slicing and dicing it in interesting ways.

In fact, ClearForest already tried to entice developers with a contest in 2006. The winner was a web application called Optevi News Tracker, which isn’t very exciting to me for a number of reasons. Among them is that I don’t think it’s a good tool for exploiting metadata. I just don’t really get much more out of the news, although that might change if it used more than MSNBC’s feed of news.

My gut tells me that what lies at the heart of News Tracker’s lackluster operation is that it just doesn’t do enough with its metadata. I can’t really put my finger on it, and I could be wrong. Am I? Or should I trust my gut?

So what is the killer metadata-driven news application going to look like? What metadata are important, and what are not? How do we want to interact with our metadata?


There are more than a few ways to remind yourself to read something or other later.

Browsers have bookmarks. Or you can save something to delicious, perhaps tagged “toread,” as very many people do. Or you can use the awesome Firefox plugin called “Read It Later.”

But I like to do my reading inside Google Reader; others like their reading inside their fave reader.

So what am I to do? My first thought was Yahoo Pipes. It’s a well-known secret that Pipes makes screen-scraping around partial feeds as easy as pie. So I thought I could maybe throw together a mashup of delicious and Pipes to get something going.

My idea was to save my to-be-read-later pages to delicious with a common tag, the common “toread” maybe. I could then have Pipes fetch from delicious the feed based on that tag. The main url for each delicious post points to the original webpage, and so, with the loop operator, I could have Pipes auto-discover the feed associated with each url in the delicious feed and then re-use that url to locate, within the feed, the post corresponding to the page to be read later.

Well, I don’t think it can be done so easily. (Please! Someone prove me wrong!)
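For what it’s worth, outside Pipes the intended pipeline is easy to sketch in Python; the hard parts in practice are the HTTP fetching and the messiness of real markup. Everything below (URLs, markup, feed entries) is canned for illustration, standing in for the delicious feed, the saved page, and its discovered feed.

```python
import re

def discover_feed_url(page_html):
    """Find the RSS auto-discovery <link> in a page's markup."""
    match = re.search(
        r'<link[^>]+type="application/rss\+xml"[^>]+href="([^"]+)"',
        page_html)
    return match.group(1) if match else None

def entry_for_page(feed_entries, page_url):
    """Locate the feed entry whose link points back at the saved page."""
    return next((e for e in feed_entries if e["link"] == page_url), None)

# Step 1: delicious hands back a page I tagged "toread".
saved_page = "http://example.com/posts/42"
page_html = ('<head><link rel="alternate" type="application/rss+xml" '
             'href="http://example.com/feed" /></head>')

# Step 2: auto-discover the page's feed from its markup.
feed_url = discover_feed_url(page_html)

# Step 3: fetch the feed (canned here) and find the matching post.
feed_entries = [
    {"link": "http://example.com/posts/41", "title": "Older post"},
    {"link": "http://example.com/posts/42",
     "title": "The post to read later"},
]
print(feed_url)
print(entry_for_page(feed_entries, saved_page)["title"])
```

Where this breaks down in the wild is exactly where I suspect Pipes breaks down too: pages without auto-discovery links, and feeds whose entry links don’t match the page url exactly.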

Meantime, I’ll just use my handy Greasemonkey plug-in that lets me “preview” posts inside the Google Reader wrapper, so that I don’t have to hop from tab to tab like a lost frog.

Meantime, someone should really put together this app. Of course, it would only work smoothly with pages that have RSS analogues in a feed. But if, through Herculean effort, you found some practicable way to detect that a given page doesn’t, and could parse out the junk and serve me only the text, you’d be a real hero. Otherwise, just tell me that the page I’m trying to read later doesn’t have an RSS analogue, give me an error message, and I’ll move on…assured in the knowledge that it soon will.

Gatherers and Packagers: When Product and Brand Cleave 4 Realz

Jeff Jarvis writes about the coming economics of news:

When the packager takes up and presents the gatherer’s content in whole and monetizes it—mostly with advertising—they share the revenue. When the gatherer just links, the gatherer monetizes the traffic, likely as part of an ad network as well.

I think this is right. In the first case, the content is on the “packager’s” page or in its feed; in the second, the content is on the “gatherer’s” page or in its feed. In both cases, advertising monetizes the content (let’s say), and readers or viewers find it by way of the packager’s brand (a coarse but inevitable word).

To me, however, the location of the user’s experience seems unimportant—in fact, the whole point of disaggregating journalism into two functions, imho, is to free up the content from the chains of fixed locations. Jarvis writes, “The packagers’ job would be to find the best news and information for their audience no matter where it comes from.” I agree, but why not let it go anywhere too—anywhere, that is, where the packager can still monetize it? (See Attributor if that sounds crazy.)

Couple this with the idea that rss-like subscriptions are on the rise as the mechanism by which we get our content, replacing search in part. (As has been said before, there’s no spam on Twitter. Why not? Followers just unsubscribe.) The result is that the packager still maintains his incentive to burnish his reputation and sell his brand. After all, that’s what sploggers are: packagers without consciences who get traffic via search.

So I agree with Jarvis: “reliably bringing you the best package and feed of news that matters to you from the best sources” is how “news brands survive and succeed.” That’s how “the packagers are now motivated to assure that there are good sources.”

Give me tags, Calais!

Who needs to think about buying tags when Reuters and its newly acquired company are giving them away?

The web service is free for commercial and non-commercial use. We’ve sized the initial release to handle millions of requests per day and will scale it as necessary to support our users.

I mean, Jesus, it’s so exciting and scary (!) all at once:

This metadata gives you the ability to build maps (or graphs or networks) linking documents to people to companies to places to products to events to geographies to … whatever. You can use those maps to improve site navigation, provide contextual syndication, tag and organize your content, create structured folksonomies, filter and de-duplicate news feeds or analyze content to see if it contains what you care about. And, you can share those maps with anyone else in the content ecosystem.
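As a toy example of those document-to-entity “maps,” here’s a Python sketch that inverts extracted entities into an index and then links documents that share one. The documents and entities are invented, not actual Calais output, and a real graph would carry entity types and relevance scores.

```python
from collections import defaultdict

# Invented per-document entity extractions: (type, name) pairs.
extracted = {
    "doc-1": [("Person", "Tom Glocer"), ("Company", "Reuters")],
    "doc-2": [("Company", "Reuters"), ("Company", "ClearForest")],
    "doc-3": [("Person", "Tim O'Reilly")],
}

# Invert: which documents mention each entity?
docs_by_entity = defaultdict(set)
for doc, entities in extracted.items():
    for entity in entities:
        docs_by_entity[entity].add(doc)

# Two documents are linked if they share an entity -- the "map".
links = {(a, b)
         for docs in docs_by_entity.values()
         for a in docs for b in docs if a < b}
print(sorted(links))  # [('doc-1', 'doc-2')] -- linked through Reuters
```

Navigation, de-duplication, and filtering all fall out of walks over this one structure.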

More: “What Calais does sounds simple—what you do with it could be simply amazing.”

If the world were smart, there would be a gold rush to be first to build the killer app. Mine will be for serving the information needs of communities in a democracy—in a word, news. Who’s coming with me?

PS. Good for Reuters. May its bid to locate itself at the pulsing informational center of the semantic web and the future of news prove as ultimately lucrative as it is profoundly socially benevolent.

Google Reader Counts Past One Hundred

That’s awesome. Whew, I shall remember these halcyon days warmly.

I can’t find the official word, however, so I can’t put a link on offer. You’ll just have to log in and check—if you’re like me and can now fret that the number of posts you have yet to read seems to have leaped by an order of magnitude, now up to “1000+” and beyond.

Actually, it’s great knowing the difference between 103 and 803.
