Archive for the 'web2.0' Category

That’s one small step for Google, one giant leap for text-audio convergence

So you’ve seen the cult classic youtube video “The Machine Is Us/ing Us.”

It’s mostly about the wonders of hypertext—that it is digital and therefore dynamic. You can remix it, link to it, etc.

But form and content can be separated, and XML was designed to improve on HTML for that reason. That way, the data can be exported, free of constraints.

Google’s now embarked on a mission to free the speech data locked up in youtube videos.

There’s no indication that it’ll publish transcripts, which is super too bad, but it’s indexing them and making them searchable. Soon enough every word spoken on youtube will be orders of magnitude more easily located, integrated, and re-integrated, pushed and pulled, aggregated and unbundled.

Consider a few simple innovations borne of such information.

Tag clouds, for instance, of what the english-speaking world is saying every day. If you take such a snapshot every day for a year and animate them, then you get a twisting, turning, winding stream of our hopes and fears, charms and gripes.
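
To make that concrete, here’s a minimal sketch of what one day’s snapshot might be, assuming that day’s transcripts are already in hand as plain strings; the stop-word list and the sample data are stand-ins, and a real cloud would weight and filter far more carefully.

```python
# A rough sketch of one day's tag-cloud snapshot. `transcripts`, the stop-word
# list, and the weighting are all stand-ins for illustration.
from collections import Counter
import re

transcripts = ["hope and change", "fear of change", "change we can believe in"]  # stand-in data

STOP_WORDS = {"and", "of", "we", "can", "in", "the", "a"}

def daily_snapshot(transcripts, top_n=50):
    words = re.findall(r"[a-z']+", " ".join(transcripts).lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)  # (word, weight) pairs, ready to size a cloud

print(daily_snapshot(transcripts))  # e.g. [('change', 3), ('hope', 1), ('fear', 1), ('believe', 1)]
```

String 365 of those frames together and you have the animation.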

Clusters, for another, of videos with similar topics or sentiments. Memetracking could move conversations away from the email-like reply system in youtube to being something more organic and less narrowly linear.

Advertisements, for a last, of a contextual nature, tailored to fit the video without having to rely on human-added metadata.

Wait, announcements, for a very last, of an automated kind. If you create a persistent search of ‘obama pig,’ grab the rss feed, and push it into twitter, then you’re informing the world when your fave presidential candidate says something funny.
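
For the curious, the plumbing for that last one might look something like the sketch below; the search-feed URL and the post_to_twitter helper are hypothetical stand-ins, while the feed parsing leans on the real feedparser library.

```python
# A minimal sketch of the persistent-search-to-twitter pipeline above. The feed
# URL and post_to_twitter() are made-up stand-ins; feedparser does the parsing.
import time
import feedparser

SEARCH_FEED = "http://example.com/search.rss?q=obama+pig"  # stand-in for a persistent-search feed
seen = set()

def post_to_twitter(text):
    print("tweeting:", text)  # wire this to whatever twitter client you actually use

while True:
    for entry in feedparser.parse(SEARCH_FEED).entries:
        key = entry.get("id", entry.get("link"))
        if key and key not in seen:
            seen.add(key)
            post_to_twitter(f"{entry.title} {entry.link}")
    time.sleep(600)  # poll every ten minutes
```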

The Great Unbundling: A Reprise

This piece by Nick Carr, the author of the recently popular “Is Google Making Us Stupid?” in the Atlantic, is fantastic.

My summary: A print newspaper or magazine provides an array of content in one bundle. People buy the bundle, and advertisers pay to catch readers’ eyes as they thumb through the pages. But when a publication moves online, the bundle falls apart, and what’s left are just the stories.

This may no longer be a revolutionary thought to anyone who knows that google is their new homepage—the page from which people enter their site laterally through searches. But that doesn’t mean it’s not the new gospel for digital content.

There’s only one problem with Carr’s argument. Because it focuses on the economics of production, its observation of unbundling doesn’t go far enough. Looked at another way—from the economics of consumption and attention—not even stories are left. In actuality, there are just keywords entered into google searches. That’s increasingly how people find content, and in an age of abundance of content, finding it is what matters.

That’s where our under-wraps project comes into play. We formalize the notion of people finding content through simple abstractions of it. Fundamentally, from the user’s perspective, the value proposition lies with the keywords, or the persons of interest, not the piece of content, which is now largely commodified.

That’s why we think it’s a pretty big idea to shift the information architecture of the news away from focusing on documents and headlines and toward focusing on the newsmakers and tags. (What’s a newsmaker? A person, corporation, government body, etc. What’s a tag? A topic, a location, a brand, etc.)

The kicker is that, once content is distilled into a simpler information architecture like ours, we can do much more exciting things with it. We can extract much more interesting information from it, make much more valuable conclusions about it, and ultimately build a much more naturally social platform.

People will no longer have to manage their intake of news. Our web application will filter the flow of information based on their interests and the interests of their friends and trusted experts, allowing them to allocate their scarce attention most efficiently.

It comes down to this: Aggregating documents gets you something like Digg or Google News—great for attracting passive users who want to be spoon fed what’s important. But few users show up at Digg with a predetermined interest, and predetermined interest is precisely what let google monetize search ads over display ads and bring yahoo to its knees. Aggregating documents makes sense in a document-scarce world; aggregating the metadata of those documents makes sense in an attention-scarce world. When it comes to the news, newsmakers and tags comprise the crucially relevant metadata, which can be rendered in a rich, intuitive visualization.

Which isn’t to say that passive users who crave spoon-fed documents aren’t valuable. We can monetize those users too—by aggregating the interests of our active users and reverse-mapping them, so to speak, back onto a massive set of documents in order to find the most popular ones.

Whither Tag Clouds?

A few weeks ago, it took relatively little clicking around the interwebs to notice the tear of pretty tag clouds powered by wordle. Bloggers of all stripes posted a wordle of their blog. Some, like Jeff Jarvis, mused about how the visualizations represent “another way to see hot topics and another path to them.”

For as long as tag clouds have been a feature of the web, they’ve also been an object of futurist optimism, kindling images of Edward Tufte and notions that if someone could just unlock all those dense far-flung pages of information, just present them correctly, illumed, people everywhere would nod and understand. Their eyes would grow bright, and they would smile at the sheer sense it all makes. The headiness of a folksonomy is sweet for an information junkie.

It’s in that vein that ReadWriteWeb mythologizes the tag cloud as “buffalo on the pre-Columbian plains of North America.” A reader willing to cock his head and squint hard enough at the image of tag clouds “roaming the social web” as “huge, thundering herds of keywords of all shades and sizes” realizes that Rob Cottingham would have us believe that tag clouds were graceful and defenseless beasts—and also now on the verge of extinction. He’s more or less correct.

I used to mythologize the tag cloud, but let’s be honest. They were never actually useful. You could never drag and drop one word in a tag cloud onto another to get the intersection or union of pages with those two tags. You could never really use a tag cloud to subscribe to RSS feeds of only the posts with a given set of tags.

A tag also never told you whether J.P. Morgan was a person or a bank. A tag cloud on a blog was never dynamic, never interactive. The tag cloud on one person’s blog never talked to the tag cloud on anyone else’s. I could never click on one tag and watch the cloud reform and show me only related tags, all re-sized and -colored to indicate their frequency or importance only in the part of the corpus in which the tag I clicked on is relevant.
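
Here’s roughly what I mean by that drag-and-drop combination, sketched with a made-up mapping of tags to post URLs; a real cloud would build the mapping from the blog’s own tag index, but the operation itself is nothing more than a set intersection or union.

```python
# A toy sketch of drag-and-drop tag combination. The tag-to-posts mapping is
# made up; combining two tags is just set intersection or union.
posts_by_tag = {
    "iraq": {"post-1", "post-2", "post-5"},
    "health care": {"post-2", "post-3"},
}

def combine(tag_a, tag_b, mode="intersection"):
    a, b = posts_by_tag.get(tag_a, set()), posts_by_tag.get(tag_b, set())
    return a & b if mode == "intersection" else a | b

print(combine("iraq", "health care"))                # pages with both tags
print(combine("iraq", "health care", mode="union"))  # pages with either tag
```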

But there are also cooler-headed thoughts to have here. If tag clouds don’t work, what will? What is the best way to navigate around those groups of relatively many words called articles or posts? In the comments to Jarvis’s post, I asked a set of questions:

How will we know when we meet a visualization of the news that’s actually really useful? Can some visualization of the news lay not just another path to the “hot topic” but a better one? Or will headlines make a successful transition from the analog past of news to its digital future as the standard way we find what we want to read?

I believe the gut-level interest in tag clouds comes in part from the sense that headlines aren’t the best way to navigate around groups of articles much bigger than the number in a newspaper. There’s a real pain point there: scanning headlines doesn’t scale. Abstracting away from them, however, and focusing on topics and newsmakers in order to find what’s best to read or watch just might work.

I think there’s a very substantial market for smarter tag clouds. They might look very different from what we’ve seen, but they will let us see lots of information at a glance and help us get to the best stuff faster. After all, the articles we want to read, the videos we want to watch, and the conversations we want to have around them are what’s actually important.

In “Plug ‘n’ Play” Journalism, Plugs Very Expensive

Pondering the Future of News, Steve Boriss asks, “if it is financially viable for TV networks to embed advertising into individual programs then make them available for download on any web site, why can’t independent reporters do the same with their stories, particularly those involving video?”

The answer is the very high barrier to entry that comes by way of transaction costs, which are just too high relative to potential ad revenue to make sense for ad buyers. There’s a guy—a guy in fancy clothes sitting in his glass-paneled office on Madison Avenue—who’s got to make decisions about the best uses of his time. It doesn’t make sense for him to buy $1000 in ads from Joe Regular.

That’s where advertising networks come in. Putting google ads on your site will help you out, for instance. For the time being, however, google pays too little because it has little competition but also, more importantly, because its targeting isn’t good enough yet. The ads aren’t actually worth enough.

Coase-based advertising in which users auction their attention stream to advertisers by way of a grand platform is the way to go.

Digg Adds Depth

Digg just added social networking to its position as the leading player in submit-and-vote news! Yes, Digg added the second component of networked news to the first.

I’m not sure enough people will have enough friends to end up caring much more about what those friends think about the news than about what the universe of diggers thinks. I, for one, as a twenty-something workaday guy, just don’t know enough people who use Digg to slurp up their news efficiently.

But maybe there are fifteen-year-olds who use Digg to get all their news. And maybe there are enough who have lots of other friends who use Digg similarly. If so, the submit-and-vote version of the first component of networked news could be on its way.

Many people, including me, don’t use Digg because its content—often dominated, they say, by upper-middle class geeky white dudes—just doesn’t cut it. I’ll stick with hours upon hours in front of google reader, backed up by aideRSS, of course. But with networks of friends, like-minded intellectuals, no doubt, Digg could really scratch my itch for content on the impending collapse of the dollar or Barack Obama’s position on chatting with foreign leaders or this conference I want to go to badly. (They say there’s so little room! They say Dave Winer may show!)

Anyhow, when are we going to be able to digg stories from outside digg.com? When am I going to install on my facebook profile a digg application, in which I can choose to see everyone’s diggs, just my friends’ diggs, just diggs of certain topics, just my diggs going back through history, etc.? When, indeed, am I going to be able to vote from facebook? Stick an ad in your widget and be done with it, Mr. Rose, who’s a near-hero of mine, for his lack of technical skills, mostly. (He paid a guy—someone else, someone who could code—ten bucks an hour to develop the site.)

PS. Mr. Cohn, toss me an invitation to the conference you and Mr. Jarvis are doing God’s, or at least the Republic’s, work to organize! And ask the top diggers whether, or under what conditions, they think their role could shrink because people like me would shift our attention away from the Digg homepage to our own friend-centered niches once Digg brings on the second component of networked news!

Loving aideRSS

Tough love, that is—there’s a lot more I want out of this.

But first, aideRSS is awesome. When I serve it a blog’s feed, it looks at how many comments, delicious saves, and other mentions each post has and then divides them up according to their popularity relative to one another. AideRSS offers me a feed for each division—the smallest circle of the “best posts,” a larger circle of “great posts,” and an even larger circle of “good posts.”
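
Something like the following sketch captures the shape of the idea, though it is certainly not aideRSS’s actual algorithm; the scoring formula and the tier cutoffs here are guesses made up for illustration.

```python
# A rough sketch of popularity-tiered feeds, not aideRSS's real algorithm.
# The score and tier cutoffs are invented for illustration.
def score(post):
    return post["comments"] + post["saves"] + post["mentions"]

def tier_feeds(posts):
    ranked = sorted(posts, key=score, reverse=True)
    n = len(ranked)
    return {
        "best": ranked[: max(1, n // 10)],   # smallest circle
        "great": ranked[: max(1, n // 4)],   # larger circle, contains "best"
        "good": ranked[: max(1, n // 2)],    # largest circle, contains both
    }

posts = [
    {"title": "A", "comments": 40, "saves": 12, "mentions": 3},
    {"title": "B", "comments": 2, "saves": 0, "mentions": 1},
    {"title": "C", "comments": 15, "saves": 5, "mentions": 0},
]
print([p["title"] for p in tier_feeds(posts)["best"]])  # ['A']
```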

I’ve got two main uses for it. It ups the signal-to-noise ratio on blogs that aren’t worth reading in their filtered state, given my peculiar tastes. And it allows me to keep current with the most popular posts of blogs I don’t have time to read every single day. That’s huge.

There are real problems, however, and other curious behaviors.

Consider Marc Andreessen’s blog pmarca. For one, AideRSS strips out his byline (here’s the “good” feed). For two, it has recently, and oddly, clipped his most recent posts and made them partial feeds (I also follow Andreessen’s full feed, and it is still full). For three, aideRSS seems to strip out all the original dates and replace them with some date of its own.

That’s a problem. Google Reader shows Andreessen’s post “Fun with Hedge Funds: Catfight!” as published on August 16, 2007, yet it’s the most recent post in AideRSS’s filtered feed of Andreessen’s “good” posts: it follows “The Pmarca Guide to Startups, part 8” in the “good” feed but precedes it in the regular feed.

Did the post about the hedge funds and the cat fight receive some very recent comments, more than a few days after it was first published? All else equal, it wouldn’t be a problem to have the posts out of order—that would seem to be the sometimes inevitable result of late-coming comments or delayed delicious saves, etc. But all else is not equal—because the original dates are stripped. Posts in a blog exist relative to one another in time. Stripping out the dates and then reordering the posts smothers those important relationships.

But let’s look to the horizon. AideRSS can’t handle amalgamated feeds. I want to serve it what Scoble calls his link blog—the feed of all the very many items he shares in Google Reader—and receive only the most popular. That way, I would get the benefit of two different kinds of networked news at once. I’d get the intersection of the crowd’s opinion and the trusted expert’s opinion.

I’d also like to serve it a big mashup of lots of feeds—say, my favorite five hundred, routed through Pipes—and have it return the top two percent of all posts. That kind of service could compete with Techmeme, but it could be dynamic. We could all build our own personalized versions of Techmeme. That would be huge.

Trying it out a few different ways gave wild results. The posts in an amalgamated feed weren’t exactly being compared to one another on a level playing field—so that even a relatively bad TechCrunch post with ten comments crushes a small-time blogger’s amazing post with eight comments. But neither were they being compared only by way of numerical rankings derived from first comparing each post to the other posts in its original feed.

Why can’t aideRSS measure each post’s popularity with respect to its kin even when it’s among strangers? The share function within Google Reader gives aideRSS the original url for each post. Can’t aideRSS take the original url for each post, find the original feed for each post, and then analyze each post against the other posts in its original feed? That would be much more analysis, for sure, but it would also be much more valuable. I’d love to see it.
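
In code, the fix might look something like this sketch, assuming each shared item still carries its original post URL. The grouping-by-domain trick and the comment-count score are crude stand-ins, not anything aideRSS actually does.

```python
# A sketch of ranking a mashup by within-feed popularity rather than raw counts.
# Grouping by domain stands in for resolving each post's original feed.
from urllib.parse import urlparse
from collections import defaultdict

def score(post):
    return post["comments"]  # stand-in popularity measure

def percentile_within_feed(post, siblings):
    worse = sum(1 for p in siblings if score(p) < score(post))
    return worse / max(1, len(siblings) - 1)

def rank_mashup(posts):
    by_feed = defaultdict(list)
    for post in posts:
        by_feed[urlparse(post["url"]).netloc].append(post)
    scored = [(percentile_within_feed(p, sibs), p) for sibs in by_feed.values() for p in sibs]
    return [p for _, p in sorted(scored, key=lambda t: t[0], reverse=True)]

posts = [
    {"url": "http://techcrunch.com/a", "comments": 10},   # middling for TechCrunch
    {"url": "http://techcrunch.com/b", "comments": 300},
    {"url": "http://smalltime.org/gem", "comments": 8},    # a standout for its own blog
    {"url": "http://smalltime.org/meh", "comments": 1},
]
print([p["url"] for p in rank_mashup(posts)])  # the small-time gem outranks the middling TechCrunch post
```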

Of course, while it may be a surprise or unintuitive at first, all this is really just one particular take on the first and second components of networked news—pulling in your news from a network of publishers and from a network of readers, including friends and experts and others. Without my additions, aideRSS represents just the second component, in which we get news based on whether others are reading it and participating in the conversation around it. My additions bring a little of the first component.

UPDATE: It would also be awesome to serve aideRSS the feed generated by a WordPress tag or by a persistent Google News search. That would be bringing in a shade of the third component of networked news.

Breaking Content, Building Conversation

Deep down, what makes the new kind of debate from the Huffington Post, Slate, and Yahoo! actually really exciting is the extent to which it represents the third component of networked news.

What, again, is the third component of networked news? It’s a data-driven network of the people and the issues in the news.

Although very limited in scale, this example of being able to slice and dice a stodgy debate is amazingly powerful. Jarvis knows it. He groks how this means a “conversation”—a free-flowing exchange of information among people along a topic or around some substance of interest to everyone involved, both the speakers or writers and the listeners or readers. As I’ve noted before, I think Jarvis also, at some level, gets the importance of structuring the news around the people who are in it and who consume it and interact with it.

That’s what this is. Once the candidates have had their chances, we listeners get to pull apart their interviews, re-arrange them, and piece together a conversation, organized by issue. We can ignore candidates and focus on others. We can focus on Iraq, or maybe even withdrawal from Iraq, or we can weave in and out of interrelated topics, like, say, security and civil rights or single-payer health care and taxes, comparing each candidate’s self-consistency and comparing them all to one another. (I’m for security and civil rights and single-payer health care and taxes.)

This is awesome. Huffington Post is blowing up. For realz.

To bring in the first two components of networked news, HuffPo and co would have to give us the tools to weave in our own video clips and then let us share them with one another as variously trusting members of a community.

Let me juxtapose my own counterarguments to a windbag’s dissembling. Or let me loose some praise on another candidate’s courage. For that matter, let me juxtapose my praise for a candidate’s courage with another citizen’s attack on that same candidate’s cowardice. Let us argue with one another—and do it alongside the evidence.

And then let us, users and consumers, mixers and contributors, define relationships among one another. Let us grow our relationships. Let me read some smart midwesterner’s opinions on farm subsidies and then let me subscribe only to his agriculture-related content. Or let me take a wide-angle view of the network of conversations we citizens are having. Let me find out how many people really care about extraterritorial rendition, or let me get a sense of who wants big government to let them be. Let me check out which clips are the most viewed or most mashed-up.

That would be awesome.

News Graph?

Mark Zuckerberg once upon a time extolled facebook and told us about this thing called a “social graph.” Bernard Lunn has just talked about an “innovation graph.”

What about a “news graph”? Hubs and spokes—call them nodes and bridges.

Nodes are the people who are the subjects of the news. Like Karl Rove or Paris Hilton or Chuck Prince. Maybe nodes can also be groups of people acting as a single agent. Like the 100th Congress or the Supreme Court or maybe even something really big like Disney Corp.

Bridges are the news issues connecting the people to whom they are relevant. Here, the bridges have substance apart from mere connection. It would be like a social graph having connections indicating different kinds of friendship—a solid line for a great friend maybe, and a dashed line for a business acquaintance. Think of bridges like tags, just like those you might use in delicious. You find a piece of news, which comes in the form of a newspaper article or a blog post, for example, and you assign issue-tags to it. Then, in turn, you assign that article or post to the people-nodes whom it discusses. The issue-tags flow through the article to the people-nodes to which the article or post is assigned; the pieces of news fall out of this picture of the news graph.

When people-nodes have issue-tags thus associated with them, we can indicate when certain people-nodes share certain issue-tags. If we represent those shared characteristics with bridges that connect the people-nodes, we’re graphing the people in the news and the substantive issues that bind them all up into the story of the world at some slice in time.
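
A toy version of that structure, with made-up articles, names, and tags, could be as simple as the following.

```python
# A toy sketch of the news graph: issue-tags flow from articles onto the
# people-nodes they discuss, and a bridge appears wherever two nodes share a tag.
# The articles, names, and tags are made up for illustration.
from collections import defaultdict
from itertools import combinations

articles = [
    {"url": "post-1", "tags": {"subprime", "writedowns"}, "people": {"Chuck Prince"}},
    {"url": "post-2", "tags": {"writedowns", "oversight"}, "people": {"Chuck Prince", "100th Congress"}},
]

# Step one: tags flow through each article to the people-nodes it's assigned to.
tags_by_person = defaultdict(set)
for article in articles:
    for person in article["people"]:
        tags_by_person[person] |= article["tags"]

# Step two: bridges connect people-nodes that share issue-tags.
bridges = {
    (a, b): tags_by_person[a] & tags_by_person[b]
    for a, b in combinations(tags_by_person, 2)
    if tags_by_person[a] & tags_by_person[b]
}

print(bridges)  # e.g. {('Chuck Prince', '100th Congress'): {'writedowns', 'oversight'}}
```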

Just note once more how the pieces of the news—the bits of content, as I call them—fall away and liberate the news and the people and issues it comprises from the narrow confines created by the printing press and furthered by HTML. (Check out Jarvis’s more than mildly inspiring post.) This kind of news graph would, at long last, make the bits of content contingent on the people and the issues they discuss. It’s the elegant organization for news.

This is, by the way, the third component of networked news. This is the data-driven network of the people and the issues in the news.

Facebook Hacked? My Identity Too?

So very many people use facebook. So very many people, though they may not realize it, rely on facebook to establish a real presence of themselves for others to see. That presence happens to be online, but no matter. It’s identity.

That’s why it may shock very many people that they have put their identities in the hands of a private company—one that seeks profit, naturally enough—the guts of whose website have been revealed. Techcrunch says that just “a quick glance” reveals “hidden aspects of the platform” that “give a potential attacker a good head start.” That said, many of the comments on that post take the whole thing to be a hoax.

Anyhow, note that facebook seems to have a comment at Techcrunch verifying a problem. If the comment had been left at Google News, would there be any doubt?

It doesn’t make much sense to wonder whether Web 2.0 projects like facebook are “due” for some wildly major breach, for lots of reasons, like the fact that no particular person with a facebook profile is due for such a serious intrusion. So far, so good….

Grokky Jarvis Has Something to Say about the News

Jeff Jarvis is correct. Equally important in the land of new ideas, however, is vivid articulation. Write the truth, and write it well.

And so he does. To wit, “Like most everyone else chasing this golden fleece, I’ve defined [hyperlocal news] as content, news, a product, listings, data, software, sites, ads. It’s not. Local is people: who knows what, who knows whom, who’s doing what (and, yes, who’s doing whom). The question should be…how we bring them elegant organization.” That’s Zuckerberg’s term—elegant organization. Jarvis likes it a lot. He’ll tell you about it too.

“I now believe that he who figures out how to help people organize themselves,” Jarvis continues, “letting them connect with one another and what they all know, will end up with news, listings, reviews, data, gossip, and more as byproducts.”

I’ll take it from here. A news-based web application must organize its information around the functional units most relevant to its subject. When the subject is news, the most relevant functional units are people and issues and organizations. Note that a full thirty percent of google searches are for people, for instance, says Jaideep Singh, CEO of Spock, in this July 2 PodTech video. Also note the proliferation of person-centric search engines, like Spock, ZabaSearch, Pipl, PeekYou, and Ligit, to name a quick few.

Today, however, the news is still fundamentally organized around its content, its tiny bits of content, its data, whether those be newspaper articles, blog posts, podcasts, or webpages. That organization—in which people and issues are contingent upon the bits of content that discuss them—is a relic of paper and, just as important, html. The article has taken the story hostage. That must be turned on its head: the bits of content must be contingent on the people they discuss. The people, and also the issues, who constitute the story, as it were, must be liberated from the confines of the article. That’s the promise the internet makes to journalism in the twenty-first century. That’s the promise the database makes to news.

A newspaper article will get broken into pieces, like legos that interlock: “little objects,” as Scoble once called them. Those objects will be stored individually, deployable individually, graphable individually. Individually, but not alone. They will live in cells among millions of other cells, part of a semantic hive buzzing with the fervor of the world’s news. Or at least the world’s news according to the internet.

By slicing up the data, by breaking up the data, we can put it back together. Only we can put it back together however we like, as individuals and as a collective—confident in our ability to tell whatever story may yet be lurking in the interstices of modern journalism. Blogs created an army of journalists. The web needs an application that will arm a legion of editors, each driven largely by their own individual taste for consuming news but cooking up a social feast of intelligent information.

Jarvis again: “People, not content. People, not data. People, not software.” Wait, not software? Getting a bit carried away is a small price to pay for generating so much momentum in the first place.

