Archive for the 'web2.0' Category

That’s one small step for Google, one giant leap for text-audio convergence

So you’ve seen the cult classic youtube video “The Machine Is Us/ing Us.”

It’s mostly about the wonders of hypertext—that it is digital and therefore dynamic. You can remix it, link to it, etc.

But form and content can be separated, and XML was designed to improve on HTML for that reason. That way, the data can be exported, free of constraints.

Google’s now embarked on a mission to free the speech data locked up in youtube videos.

There’s no indication that it’ll publish transcripts, which is too bad, but it’s indexing them and making them searchable. Soon enough every word spoken on youtube will be orders of magnitude more easily located, integrated, and re-integrated, pushed and pulled, aggregated and unbundled.

Consider a few simple innovations born of such information.

Tag clouds, for instance, of what the English-speaking world is saying every day. If you take such a snapshot every day for a year and animate the series, then you get a twisting, turning, winding stream of our hopes and fears, charms and gripes.
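To make the daily snapshot concrete, here is a minimal sketch in Python. It assumes the transcripts are already available as plain strings, which Google has given no sign of offering. The stopword list, the function names, and the font-size range are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "it", "that"}

def word_counts(transcripts):
    """Count non-stopword words across one day's worth of transcripts."""
    counts = Counter()
    for text in transcripts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts

def tag_cloud(transcripts, top_n=5, min_size=10, max_size=40):
    """Map the top_n words to font sizes scaled linearly by frequency."""
    top = word_counts(transcripts).most_common(top_n)
    if not top:
        return {}
    hi = top[0][1]           # count of the most frequent word
    lo = top[-1][1]          # count of the least frequent word we keep
    span = (hi - lo) or 1    # avoid dividing by zero when all counts match
    return {
        word: min_size + (max_size - min_size) * (count - lo) // span
        for word, count in top
    }

day = ["Hope and change and hope", "the economy is the economy", "hope"]
print(tag_cloud(day, top_n=3))
```

Run it once per day, keep the resulting dictionaries in a list, and you have the frames of the animation.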

Clusters, for another, of videos with similar topics or sentiments. Memetracking could move conversations away from the email-like reply system in youtube to being something more organic and less narrowly linear.

Advertisements, for a last, of a contextual nature, tailored to fit the video without having to rely on human-added metadata.

Wait, announcements, for a very last, of an automated kind. If you create a persistent search of ‘obama pig,’ grab the rss feed, and push it into twitter, then you’re informing the world when your fave presidential candidate says something funny.

The Great Unbundling: A Reprise

This piece by Nick Carr, the author of the recently popular “Is Google Making Us Stupid?” in the Atlantic, is fantastic.

My summary: A print newspaper or magazine provides an array of content in one bundle. People buy the bundle, and advertisers pay to catch readers’ eyes as they thumb through the pages. But when a publication moves online, the bundle falls apart, and what’s left are just the stories.

This may no longer be a revolutionary thought to anyone who knows that google is their new homepage, from which people enter their site laterally through searches. But that doesn’t mean it’s not the new gospel for digital content.

There’s only one problem with Carr’s argument, though. Because it focuses on the economics of production, I don’t think its observation of unbundling goes far enough. Looked at another way—from the economics of consumption and attention—not even stories are left. In actuality, there are just keywords entered into google searches. That’s increasingly how people find content, and in an age of abundance of content, finding it is what matters.

That’s where our under-wraps project comes into play. We formalize the notion of people finding content through simple abstractions of it. Fundamentally, from the user’s perspective, the value proposition lies with the keywords, or the persons of interest, not the piece of content, which is now largely commodified.

That’s why we think it’s a pretty big idea to shift the information architecture of the news away from focusing on documents and headlines and toward focusing on the newsmakers and tags. (What’s a newsmaker? A person, corporation, government body, etc. What’s a tag? A topic, a location, a brand, etc.)

The kicker is that, once content is distilled into a simpler information architecture like ours, we can do much more exciting things with it. We can extract much more interesting information from it, make much more valuable conclusions about it, and ultimately build a much more naturally social platform.

People will no longer have to manage their intake of news. Our web application will filter the flow of information based on their interests and the interests of their friends and trusted experts, allowing them to allocate their scarce attention most efficiently.

It comes down to this: Aggregating documents gets you something like Digg or Google News—great for attracting passive users who want to be spoon fed what’s important. But few users show up at Digg with a predetermined interest, and that predetermined interest is how google monetized search ads over display ads to bring yahoo to its knees. Aggregating documents makes sense in a document-scarce world; aggregating the metadata of those documents makes sense in an attention-scarce world. When it comes to the news, newsmakers and tags comprise the crucially relevant metadata, which can be rendered in a rich, intuitive visualization.

Which isn’t to say that passive users who crave spoon-fed documents aren’t valuable. We can monetize those users too—by aggregating the interests of our active users and reverse-mapping them, so to speak, back onto a massive set of documents in order to find the most popular ones.
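The reverse-mapping can be sketched in a few lines of Python. This is only an illustration under assumptions of my own, not a description of the actual project: each active user is reduced to a set of followed tags and newsmakers, each document to the set of tags and newsmakers it mentions, and a document's popularity is the sum of the interest its tags attract. Every name and data shape here is hypothetical.

```python
from collections import Counter

def aggregate_interests(users):
    """Count how many active users follow each tag or newsmaker."""
    totals = Counter()
    for interests in users.values():
        totals.update(interests)
    return totals

def popular_documents(documents, users, top_n=3):
    """Score each document by the summed interest in its tags, highest first."""
    weights = aggregate_interests(users)
    scores = {
        doc_id: sum(weights[tag] for tag in tags)
        for doc_id, tags in documents.items()
    }
    # Sort by descending score, breaking ties by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_n]

users = {
    "ann": {"obama", "iraq"},
    "bob": {"obama"},
    "cat": {"farm subsidies"},
}
documents = {
    "d1": {"obama", "iraq"},
    "d2": {"farm subsidies"},
    "d3": {"digg"},
}
print(popular_documents(documents, users, top_n=2))
```

The passive reader never sees the tags. They just get the documents that the active users' interests point to.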

Whither Tag Clouds?

A few weeks ago, one could do relatively little clicking around the interwebs and notice the tear of pretty tag clouds powered by wordle. Bloggers of all stripes posted a wordle of their blog. Some, like Jeff Jarvis, mused about how the visualizations represent “another way to see hot topics and another path to them.”

For as long as tag clouds have been a feature of the web, they’ve also been an object of futurist optimism, kindling images of Edward Tufte and notions that if someone could just unlock all those dense far-flung pages of information, just present them correctly, illumed, people everywhere would nod and understand. Their eyes would grow bright, and they would smile at the sheer sense it all makes. The headiness of a folksonomy is sweet for an information junkie.

It’s in that vein that ReadWriteWeb mythologizes the tag cloud as “buffalo on the pre-Columbian plains of North America.” A reader willing to cock his head and squint hard enough at the image of tag clouds “roaming the social web” as “huge, thundering herds of keywords of all shades and sizes” realizes that Rob Cottingham would have us believe that tag clouds were graceful and defenseless beasts—and also now on the verge of extinction. He’s more or less correct.

I used to mythologize the tag cloud, but let’s be honest. They were never actually useful. You could never drag and drop one word in a tag cloud onto another to get the intersection or union of pages with those two tags. You could never really use a tag cloud to subscribe to RSS feeds of only the posts with a given set of tags.
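The intersection and union I have in mind are cheap to compute once the tags sit in an inverted index. Here is a minimal sketch in Python, assuming each post is just an id with a set of tags. The function names and the sample data are hypothetical.

```python
from collections import defaultdict

def build_tag_index(posts):
    """Invert {post_id: tags} into {tag: set of post_ids}."""
    index = defaultdict(set)
    for post_id, tags in posts.items():
        for tag in tags:
            index[tag].add(post_id)
    return index

def posts_with_all(index, tags):
    """Intersection: posts carrying every one of the given tags."""
    sets = [index.get(tag, set()) for tag in tags]
    return set.intersection(*sets) if sets else set()

def posts_with_any(index, tags):
    """Union: posts carrying at least one of the given tags."""
    return set().union(*(index.get(tag, set()) for tag in tags))

posts = {
    "p1": {"google", "video"},
    "p2": {"google", "news"},
    "p3": {"news"},
}
index = build_tag_index(posts)
print(posts_with_all(index, ["google", "news"]))
print(posts_with_any(index, ["google", "news"]))
```

A feed limited to a given set of tags would be `posts_with_all` run over each new batch of posts.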

A tag also never told you whether J.P. Morgan was a person or a bank. A tag cloud on a blog was never dynamic, never interactive. The tag cloud on one person’s blog never talked to the tag cloud on anyone else’s. I could never click on one tag and watch the cloud reform and show me only related tags, all re-sized and -colored to indicate their frequency or importance only in the part of the corpus in which the tag I clicked on is relevant.
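The click-and-reform behavior is also simple to sketch. The Python below assumes, again, that posts are ids mapped to sets of tags. It counts related tags only within the slice of the corpus that carries the clicked tag, and those counts would then drive size and color. The names and the data are made up for illustration.

```python
from collections import Counter

def related_tags(posts, clicked):
    """Count tags that co-occur with `clicked`, using only posts that carry it."""
    counts = Counter()
    for tags in posts.values():
        if clicked in tags:
            counts.update(tag for tag in tags if tag != clicked)
    return counts

posts = {
    "p1": {"obama", "iraq"},
    "p2": {"obama", "iraq", "taxes"},
    "p3": {"taxes", "farm subsidies"},
}
print(related_tags(posts, "obama").most_common())
```

Tags that never appear alongside the clicked one simply drop out of the cloud.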

But there are also cool-headed thoughts to have here. If tag clouds don’t work, what will? What is the best way to navigate around those groups of relatively many words called articles or posts? In the comments to Jarvis’s post, I asked a set of questions:

How will we know when we meet a visualization of the news that’s actually really useful? Can some visualization of the news lay not just another path to the “hot topic” but a better one? Or will headlines make a successful transition from the analog past of news to its digital future as the standard way we find what we want to read?

I believe the gut-level interest in tag clouds comes in part from the sense that headlines aren’t the best way to navigate around groups of articles much bigger than the number in a newspaper. There’s a real pain point there: scanning headlines doesn’t scale. Abstracting away from them, however, and focusing on topics and newsmakers in order to find what’s best to read or watch just might work.

I think there’s a very substantial market for smarter tag clouds. They might look very different from what we’ve seen, but they will let us see at a glance lots of information and help us get to the best stuff faster. After all, the articles we want to read, the videos we want to watch, and the conversations we want to have around them are what’s actually important.

In “Plug ‘n’ Play” Journalism, Plugs Very Expensive

Pondering the Future of News, Steve Boriss asks, “if it is financially viable for TV networks to embed advertising into individual programs then make them available for download on any web site, why can’t independent reporters do the same with their stories, particularly those involving video?”

The answer is very high barriers to entry that come by way of transaction costs, which are just too high relative to potential ad revenue for it to make sense for buyers of ads. There’s a guy—a guy in fancy clothes sitting in his glass-paneled office on Madison Avenue—who’s got to make decisions about the best uses of his time. It doesn’t make sense for him to buy $1000 in ads from Joe Regular.

That’s where advertising networks come in. Putting google ads on your site will help you out, for instance. For the time being, however, google pays too little because it has little competition but also, more importantly, because its targeting isn’t good enough yet. The ads aren’t actually worth enough.

Coase-based advertising in which users auction their attention stream to advertisers by way of a grand platform is the way to go.

Digg Adds Depth

Digg just added social networking to its position as the leading player in submit-and-vote news! Yes, Digg added the second component of networked news to the first.

I’m not sure enough people will have enough friends to end up caring so much more about what they think about the news than what the universe of diggers thinks about the news. I, for one, as a twenty-something workaday guy, just don’t know enough people who use Digg to slurp up their news efficiently.

But maybe there are fifteen-year-olds who use Digg to get all their news. And maybe there are enough who have lots of other friends who use Digg similarly. If so, the submit-and-vote version of the first component of networked news could be on its way.

Many people, including me, don’t use Digg because its content—often dominated, they say, by upper-middle class geeky white dudes—just doesn’t cut it. I’ll stick with hours upon hours in front of google reader, backed up by aideRSS, of course. But with networks of friends, like-minded intellectuals, no doubt, Digg could really scratch my itch for content on the impending collapse of the dollar or Barack Obama’s position on chatting with foreign leaders or this conference I want to go to badly. (They say there’s so little room! They say Dave Winer may show!)

Anyhow, when are we going to be able to digg stories from outside digg.com? When am I going to install on my facebook profile a digg application, in which I can choose to see everyone’s diggs, just my friends’ diggs, just diggs of certain topics, just my diggs going back through history, etc.? When, indeed, am I going to be able to vote from facebook? Stick an ad in your widget and be done with it, Mr. Rose, who’s a near-hero of mine, for his lack of technical skills, mostly. (He paid a guy—someone else, someone who could code—ten bucks an hour to develop the site.)

PS. Mr. Cohn, toss me an invitation to the conference you and Mr. Jarvis are doing God’s, or at least the Republic’s, work to organize! And ask the top diggers whether, or under what conditions, they think their role could shrink because people like me would shift our attention away from the Digg homepage to our own friend-centered niches by way of Digg’s bringing on the second component of networked news!

Loving aideRSS

Tough love, that is—there’s a lot more I want out of this.

But first, aideRSS is awesome. When I serve it a blog’s feed, it looks at how many comments, delicious saves, and other mentions each post has and then divides them up according to their popularity relative to one another. AideRSS offers me a feed for each division—the smallest circle of the “best posts,” a larger circle of “great posts,” and an even larger circle of “good posts.”
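Here is a rough Python sketch of how that kind of division could work. It is not aideRSS's actual algorithm, which I haven't seen. It assumes the engagement score is a plain sum of comments, saves, and mentions, and that the three circles are the top 10, 30, and 60 percent of a feed. The cutoffs, field names, and function names are all invented for illustration.

```python
def engagement(post):
    """Hypothetical score: a plain sum of the signals described above."""
    return post["comments"] + post["saves"] + post["mentions"]

# Assumed cutoffs: each circle is a top fraction of the feed.
CUTOFFS = (("best", 0.1), ("great", 0.3), ("good", 0.6))

def tier_feeds(posts, cutoffs=CUTOFFS):
    """Return nested circles of posts, ranked relative to one another."""
    ranked = sorted(posts, key=engagement, reverse=True)
    feeds = {}
    for name, fraction in cutoffs:
        # Keep at least one post in every circle of a non-empty feed.
        keep = max(1, round(len(ranked) * fraction)) if ranked else 0
        feeds[name] = ranked[:keep]
    return feeds

posts = [
    {"title": f"post {i}", "comments": i, "saves": 0, "mentions": 0}
    for i in range(10)
]
feeds = tier_feeds(posts)
print([p["title"] for p in feeds["great"]])
```

Because each circle is a prefix of the same ranking, every "best" post also shows up in "great" and "good."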

I’ve got two main uses for it. It ups the signal-to-noise ratio on blogs that aren’t worth reading in their unfiltered state, given my peculiar tastes. And it allows me to keep current with the most popular posts of blogs I don’t have time to read every single day. That’s huge.

There are real problems, however, and other curious behaviors.

Consider Marc Andreessen’s blog pmarca. For one, AideRSS strips out his byline (here’s the “good” feed). For two, it has recently, and really oddly, clipped his most recent posts and made them partial feeds (I also follow Andreessen’s full feed, and it is still full). AideRSS also seems to strip out all the original dates and replace them with some date of its own.

That’s a problem. Google Reader published Andreessen’s post called “Fun with Hedge Funds: Catfight!” on August 16, 2007. But it’s the most recent post in AideRSS’s filtered feed of Andreessen’s “good” posts. The problem is that it follows “The Pmarca Guide to Startups, part 8” in the “good” feed but precedes it in the regular feed.

Did the post about the hedge funds and the cat fight receive some very recent comments, more than a few days after it was first published? All else equal, it wouldn’t be a problem to have the posts out of order—that would seem to be the sometimes inevitable result of late-coming comments or delayed delicious saves, etc. But all else is not equal—because the original dates are stripped. Posts in a blog exist relative to one another in time. Stripping out the dates and then reordering the posts smothers those important relationships.

But let’s look to the horizon. AideRSS can’t handle amalgamated feeds. I want to serve it what Scoble calls his link blog—the feed of all the very many items he shares in Google Reader—and receive only the most popular. That way, I would get the benefit of two different kinds of networked news at once. I’d get the intersection of the crowd’s opinion and the trusted expert’s opinion.

I’d also like to serve it a big mashup of lots of feeds—say, my favorite five hundred, routed through Pipes—and have it return the top two percent of all posts. That kind of service could compete with Techmeme, but it could be dynamic. We could all build our own personalized versions of Techmeme. That would be huge.

Trying it out a few different ways gave wild results. The posts in an amalgamated feed weren’t exactly being compared to one another on a level playing field—so that even a relatively bad TechCrunch post with ten comments crushes a small-time blogger’s amazing post with eight comments. But they also weren’t being compared to one another only by way of their numerical rankings derived from their first being compared to the other posts in their original feed.

Why can’t aideRSS measure each post’s popularity with respect to its kin even when it’s among strangers? The share function within Google Reader gives aideRSS the original url for each post. Can’t aideRSS take the original url for each post, find the original feed for each post, and then analyze each post against the other posts in its original feed? That would be much more analysis, for sure, but it would also be much more valuable. I’d love to see it.
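A minimal sketch of the idea in Python, under some loud assumptions: each post in the amalgamated feed already carries the name of its original feed (the hard part, resolving that from the original url, is skipped), comment count stands in for popularity, and the kin used for comparison are whichever posts from the same feed show up in the amalgam. All field and function names are hypothetical.

```python
from collections import defaultdict

def rank_against_kin(posts):
    """Rank an amalgamated feed by each post's standing within its own feed."""
    by_feed = defaultdict(list)
    for post in posts:
        by_feed[post["feed"]].append(post)

    scored = []
    for feed_posts in by_feed.values():
        counts = [p["comments"] for p in feed_posts]
        for p in feed_posts:
            # Share of sibling posts that this post strictly beats.
            beaten = sum(1 for c in counts if c < p["comments"])
            scored.append((beaten / len(counts), p["title"]))

    scored.sort(reverse=True)
    return [title for _, title in scored]

posts = [
    {"feed": "techcrunch", "title": "tc: huge", "comments": 200},
    {"feed": "techcrunch", "title": "tc: big", "comments": 150},
    {"feed": "techcrunch", "title": "tc: weak", "comments": 10},
    {"feed": "smallblog", "title": "small: amazing", "comments": 8},
    {"feed": "smallblog", "title": "small: ok", "comments": 1},
    {"feed": "smallblog", "title": "small: quiet", "comments": 0},
]
print(rank_against_kin(posts))
```

Measured this way, the small-time blogger's eight-comment post lands well above the ten-comment TechCrunch post, because each is judged only against its own siblings.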

Of course, while it may be a surprise or unintuitive at first, all this is really just one particular take on the first and second components of networked news—pulling in your news from a network of publishers and from a network of readers, including friends and experts and others. Without my additions, aideRSS represents just the second component, in which we get news based on whether others are reading it and participating in the conversation around it. My additions bring a little of the first component.

UPDATE: It would also be awesome to serve aideRSS the feed generated by a WordPress tag or by a persistent Google News search. That would be bringing in a shade of the third component of networked news.

Breaking Content, Building Conversation

Deep down, what makes the new kind of debate from the Huffington Post, Slate, and Yahoo! actually really exciting is the extent to which it represents the third component of networked news.

What, again, is the third component of networked news? It’s a data-driven network of the people and the issues in the news.

Although very limited in scale, this example of being able to slice and dice a stodgy debate is amazingly powerful. Jarvis knows it. He groks how this means a “conversation”—a free-flowing exchange of information among people along a topic or around some substance of interest to everyone involved, both the speakers or writers and the listeners or readers. As I’ve noted before, I think Jarvis also, at some level, gets the importance of structuring the news around the people who are in it and who consume it and interact with it.

That’s what this is. Once the candidates have had their chances, we listeners get to pull apart their interviews, re-arrange them, and piece together a conversation, organized by issue. We can ignore candidates and focus on others. We can focus on Iraq, or maybe even withdrawal from Iraq, or we can weave in and out of interrelated topics, like, say, security and civil rights or single-payer health care and taxes, comparing each candidate’s self-consistency and comparing them all to one another. (I’m for security and civil rights and single-payer health care and taxes.)

This is awesome. Huffington Post is blowing up. For realz.

To bring in the first two components of networked news, HuffPo and co would have to give us the tools to weave in our own video clips and then let us share them with one another as variously trusting members of a community.

Let me juxtapose my own counterarguments to a windbag’s dissembling. Or let me loose some praise on another candidate’s courage. For that matter, let me juxtapose my praise for a candidate’s courage with another citizen’s attack on that same candidate’s cowardice. Let us argue with one another—and do it alongside the evidence.

And then let us, users and consumers, mixers and contributors, define relationships among one another. Let us grow our relationships. Let me read some smart midwesterner’s opinions on farm subsidies and then let me subscribe only to his agriculture-related content. Or let me take a wide-angle view of the network of conversations we citizens are having. Let me find out how many people really care about extraterritorial rendition, or let me get a sense of who wants big government to let them be. Let me check out which clips are the most viewed or most mashed-up.

That would be awesome.


Josh Young's Facebook profile
