Archive for the 'CollectiveWisdom' Category

The Great Unbundling: A Reprise

This piece by Nick Carr, the author of the recently popular “Is Google Making Us Stupid?” in the Atlantic, is fantastic.

My summary: A print newspaper or magazine provides an array of content in one bundle. People buy the bundle, and advertisers pay to catch readers’ eyes as they thumb through the pages. But when a publication moves online, the bundle falls apart, and what’s left are just the stories.

This may no longer be a revolutionary thought to anyone who knows that Google is their new homepage, the place from which people enter their site laterally through searches. But that doesn’t mean it’s not the new gospel for digital content.

There’s only one problem with Carr’s argument, though. Because it focuses on the economics of production, its observation of unbundling doesn’t go far enough. Looked at another way, from the economics of consumption and attention, not even stories are left. In actuality, there are just keywords entered into Google searches. That’s increasingly how people find content, and in an age of abundant content, finding it is what matters.

That’s where our under-wraps project comes into play. We formalize the notion of people finding content through simple abstractions of it. Fundamentally, from the user’s perspective, the value proposition lies with the keywords, or the persons of interest, not the piece of content, which is now largely commodified.

That’s why we think it’s a pretty big idea to shift the information architecture of the news away from focusing on documents and headlines and toward focusing on the newsmakers and tags. (What’s a newsmaker? A person, corporation, government body, etc. What’s a tag? A topic, a location, a brand, etc.)
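To make that concrete, here’s a minimal sketch of the re-centered data model in Python. The class and field names are hypothetical, not our actual schema; the point is only that newsmakers and tags, not documents, become the first-class objects.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Newsmaker:
    name: str   # e.g. "Chuck Prince" or "Supreme Court"
    kind: str   # "person", "corporation", "government body", ...

@dataclass(frozen=True)
class Tag:
    label: str  # a topic, a location, a brand: "Harlem", "iPhone", ...

@dataclass
class Article:
    url: str
    headline: str
    # The article is demoted to a pointer that connects the real objects.
    newsmakers: set[Newsmaker] = field(default_factory=set)
    tags: set[Tag] = field(default_factory=set)
```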

The kicker is that, once content is distilled into a simpler information architecture like ours, we can do much more exciting things with it. We can extract much more interesting information from it, draw much more valuable conclusions about it, and ultimately build a much more naturally social platform.

People will no longer have to manage their intake of news. Our web application will filter the flow of information based on their interests and the interests of their friends and trusted experts, allowing them to allocate their scarce attention most efficiently.

It comes down to this: Aggregating documents gets you something like Digg or Google News—great for attracting passive users who want to be spoon-fed what’s important. But few users show up at Digg with a predetermined interest, and predetermined interest is precisely what let Google monetize search ads over display ads and bring Yahoo to its knees. Aggregating documents makes sense in a document-scarce world; aggregating the metadata of those documents makes sense in an attention-scarce world. When it comes to the news, newsmakers and tags comprise the crucially relevant metadata, which can be rendered in a rich, intuitive visualization.

Which isn’t to say that passive users who crave spoon-fed documents aren’t valuable. We can monetize those users too—by aggregating the interests of our active users and reverse-mapping them, so to speak, back onto a massive set of documents in order to find the most popular ones.
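As a rough sketch, that reverse-mapping could be as simple as pooling the active users’ tag interests and scoring documents against the pool. Everything below is illustrative, assuming interests are just bags of tags:

```python
from collections import Counter

def popular_documents(active_user_interests, documents, top_n=10):
    """Pool active users' tag interests, then rank documents by how
    many pooled-interest 'votes' their tags collect."""
    demand = Counter()
    for interests in active_user_interests:  # each: iterable of tag labels
        demand.update(interests)

    def score(doc):  # doc.tags: a set of tag labels
        return sum(demand[tag] for tag in doc.tags)

    return sorted(documents, key=score, reverse=True)[:top_n]
```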

Citizen Journalism Milestone

There isn’t a better account of all sides of an episode—any episode—of this “uncharted” thing Jay Rosen calls “citizen journalism.” Superlative.

Give me tags, Calais!

Who needs to think about buying tags when Reuters and its newly acquired company are giving them away?

The web service is free for commercial and non-commercial use. We’ve sized the initial release to handle millions of requests per day and will scale it as necessary to support our users.

I mean, Jesus, it’s so exciting and scary (!) all at once:

This metadata gives you the ability to build maps (or graphs or networks) linking documents to people to companies to places to products to events to geographies to … whatever. You can use those maps to improve site navigation, provide contextual syndication, tag and organize your content, create structured folksonomies, filter and de-duplicate news feeds or analyze content to see if it contains what you care about. And, you can share those maps with anyone else in the content ecosystem.

More: “What Calais does sounds simple—what you do with it could be simply amazing.”
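Here’s a minimal sketch of the kind of map the first quote describes: link documents to the entities extracted from them, then count entity co-occurrences. The extraction callable is a stand-in for whatever Calais actually returns, not its real API:

```python
from collections import defaultdict
from itertools import combinations

def build_entity_map(documents, extract_entities):
    """documents: dict of doc_id -> text. extract_entities: a callable
    standing in for a service like Calais, returning entity names.
    Returns each document's entities plus entity co-occurrence counts,
    the raw material for maps linking documents to people to places."""
    doc_entities = {}
    edges = defaultdict(int)
    for doc_id, text in documents.items():
        entities = set(extract_entities(text))
        doc_entities[doc_id] = entities
        for a, b in combinations(sorted(entities), 2):
            edges[(a, b)] += 1
    return doc_entities, edges
```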

If the world were smart, there would be a gold rush to be first to build the killer app. Mine will be for serving the information needs of communities in a democracy—in a word, news. Who’s coming with me?

PS. Good for Reuters. May its bid to locate itself at the pulsing informational center of the semantic web and the future of news prove as ultimately lucrative as it is profoundly socially benevolent.

Another boring personalized news service

I love seeing more and more copycat “intelligent” personalized news sites. The good news is that means that there are funders out there who still know in their gut that there’s money to be made on innovation in the news business. They just need the one idea that will stick. And go pop.

Meantime, more than six months ago, Mike Arrington wrote about a site called Thoof. Back then, I was also writing and thinking about Streamy and FeedEachOther and other unmemorable twists on feed readers and personalized news sites. No matter their differences, they all seem the same. I just came across yet another—Tiinker—and I just can’t bear it any more.

In his write-up of Thoof, Arrington frames the debate as taking place between two competing positions. He believes that “the masses want popular news,” while Thoof’s CEO believes that “the masses want tailored news.”

I think they’re both wrong and come at the issue the wrong way.

People want their news based on others’ interests—specialized news from friends (those who have similar interests) and widely popular news from the masses (everyone else). And they want their news based on their own interests, even if their friends don’t share those interests.

Now suppose there’s a continuum of users—from RegularJoe on one end to PowerUser on the other.

RegularJoe wants his news from other people. Although he has relatively few “friends” online, and is thinly connected to the ones he has, he wants them to put in most of the effort to help him get specialized news. (He likes reading the “Most Emailed” news articles but doesn’t email them, or he likes visiting Digg but doesn’t log in and vote.) RegularJoe is mostly interested in widely popular news.

PowerUser is different and wants his news mostly based on his own interests. But it would be a mistake to think that he pursues his interests alone (no man is an island, says Donne). He has relatively many friends and enjoys pushing and pulling mutually interesting news to and from them. Of course, PowerUser also has news interests that his friends don’t share, or don’t share as strongly, and so he pursues his news independently of his friends as well. Because he enjoys consuming a lot of information, moreover, PowerUser is also interested in widely popular news (he wants to keep his finger on the pulse).

These purely black-box algorithmic personalized news sites don’t really fit either guy.

RegularJoe: They’re too hardcore for RegularJoe. He doesn’t want his own news because his interests just aren’t sufficiently deeply cultivated. RegularJoe isn’t motivated enough to build up a profile by clicking “thumbs up” all the time (as Tiinker would have him). When he is motivated enough, he isn’t sufficiently consistent over time for these fancy algorithms to get him what he wants before he strays back to cnn.com, because it’s easier to let someone else decide (a person-editor, in this case).

PowerUser: They’re too secret for PowerUser. He wants to put in more effort cultivating his interests and doesn’t want to trust an (anti-social) algorithm from some start-up that might disappear tomorrow. PowerUser also wants to get specialized news from niche groups of friends. For him, the fact that friends X, Y, and Z read some blog post makes it inherently more interesting because they can have a conversation about it (broadly speaking). The personalized news sites just aren’t sufficiently social for the PowerUser who wants to interact with friends around the news.

This isn’t meant to be a slam-dunk argument. I’m not sure about what happens with the group of users who are in the hypothetical middle of the continuum. Maybe there’s some number of users (1) who care enough about the news to have non-trivial interests that don’t shift or fade over time but (2) who also don’t care very much for a transparent or social experience of the news. Ultimately, however, I really doubt that this group of users is big enough to support this kind of personalized news site.

Give in order to get

We must loosen our grasp on our written property in order to keep it from slipping out of our hands entirely.

See here: “Design and presentation, eventually, won’t matter. Your core content still will.”

The conclusion is maybe one step more dramatic—because it’s important to stress that your “core” content can be any given part of your overall content. Expect another company not only to re-design and re-present your content but also to select mere chunks or slivers of it, re-ordered and even re-mixed or re-worded.

But how to make money from content, then, if everyone’s pilfering and spinning? One answer is Attributor, which takes a “fingerprint” of your text. A system of presentation (say, a future newspaper’s own site) could then analyze any piece of content (say, article A) and detect whether it’s 50 percent original, 25 percent article X, and 25 percent article Y. The system could then share only half as much advertising revenue with the writer of article A and divert the rest in equal parts to the writers of X and Y. No harm, no foul. The writer of A didn’t necessarily plagiarize in any normatively bad sense, but article A is only half his, and it’s one-quarter X’s and one-quarter Y’s. They get their due, fair and square.
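Here’s a toy sketch of that arithmetic, using overlapping character shingles as a crude stand-in for Attributor’s actual fingerprinting, which is proprietary; the names and numbers are purely illustrative:

```python
def shingles(text, k=8):
    """Overlapping character k-grams as a crude content fingerprint."""
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}

def revenue_shares(article, sources, total_revenue):
    """Split ad revenue in proportion to how much of `article` overlaps
    each source; the remainder stays with the writer of the article.
    (A toy: heavily overlapping sources could double-count here.)"""
    fingerprint = shingles(article)
    shares, claimed = {}, 0.0
    for name, text in sources.items():
        overlap = len(fingerprint & shingles(text)) / len(fingerprint)
        shares[name] = overlap * total_revenue
        claimed += overlap
    shares["writer of A"] = max(0.0, 1.0 - claimed) * total_revenue
    return shares
```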

Or the writers of X and Y can serve a take-down notice. But if they do, they give up their 25 percent of the shared advertising. It’s just economics—remixed economics that may let us breathe a sigh of relief, loosen our death grip on our precious but ultimately fungible words, and begin to make profit and a living nevertheless.

If writers X and Y want the shared advertising revenue from our future newspaper, they must agree to let others replicate their work. Some writers won’t, especially at first. But the smarter writers will.

Re-use is no blow to their writerly esteem. Their original works are no less poetic (or, more likely, godawful) because others spin them into new forms. We are not gawking through a looking glass into another world where re-use is an anonymous, miscegenated norm. Let us trust that the White Album won’t vanish, that the Blank Album won’t dissolve, and that the Gray Album will be a good listen—in its own right, appropriately so. Thank you, Mr. Beatles. Much obliged, Mr. Z. And a job well done to you, Mr. Mouse. Authorship is too human for us to ignore it when someone tells us something, his story, his story.

Could there be Attributor equivalents for audio and video?

Sell me tags, Twine!

How much would, say, the New York Times have to pay to have the entirety of its newspaper analyzed and annotated every day?

The question is not hypothetical.

The librarians could go home, and fancy machine learning and natural language processing could step in and start extracting entities and tagging content. Hi, did you know Bill Clinton is William Jefferson Clinton but not Senator Clinton?! Hey there, eh, did you know that Harlem is in New York City?! Oh, ya, did you know that Republicans and Democrats are politicians, who are the silly people running around playing something called politics?!
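A crude sketch of the alias-resolution trick that paragraph jokes about; a hand-built alias table stands in for the real machine learning:

```python
# Hypothetical alias table: many surface forms, one canonical newsmaker.
CANONICAL = {
    "bill clinton": "William Jefferson Clinton",
    "william jefferson clinton": "William Jefferson Clinton",
    "senator clinton": "Hillary Rodham Clinton",  # a different Clinton!
    "hillary clinton": "Hillary Rodham Clinton",
}

def resolve(mention):
    """Map a raw mention to its canonical entity, or leave it alone."""
    return CANONICAL.get(mention.strip().lower(), mention)
```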

Twine could tell you all that. Well, they say they can, but they won’t invite me to their private party! And maybe the librarians wouldn’t have to go home. Maybe they could monitor (weave?) the Twine and help it out when it falls down (frays?).

I want to buy Twine’s smarts, its fun tags. I’d pay a heckuva lot for really precociously smart annotation! They say, after all, that it will be an open platform from which we can all export our data. Just, please, bloat out all my content with as much metadata as you can smartly muster! Por favor, sir! You are my tagging engine—now get running!

What if Twine could tag all the news that’s fit to read? It would be a fun newspaper. Maybe I’d subscribe to all the little bits of content tagged both “Barack Obama” and “president.” Or maybe I’d subscribe to all the local blog posts and newspaper articles and videos tagged “Harlem” and “restaurant”—but only if those bits of content were already enjoyed by one of my two hundred closest friends in the world.
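As a sketch, that kind of subscription is just a tag intersection plus a friend filter; the field names below are hypothetical:

```python
def subscription(items, required_tags, friends=None):
    """Yield items carrying every required tag, optionally only those
    already enjoyed by at least one friend."""
    required = set(required_tags)
    for item in items:  # item.tags and item.enjoyed_by are sets
        if not required <= item.tags:
            continue
        if friends and not (set(friends) & item.enjoyed_by):
            continue
        yield item

# e.g. subscription(feed, {"Harlem", "restaurant"}, friends=my_200_friends)
```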

I’d need a really smart and intuitive interface to make sense of this new way of approaching the news. Some online form of newsprint just wouldn’t cut it. I’d need a news graph, for sure.

See TechCrunch’s write-up, Read/Write Web’s, and Nick Carr’s too.

PS. Or I’ll just build my own tagging engine. It’ll probably be better because I can specifically build it to reflect the nature of news.

RE: Telling stories on the Web is like developing software using agile principles

My human-readable un-remixing of Michael Amedeo Tumolillo’s remixing of Alex Iskold’s mix (itself a comment on this awesome book):

In the Web world, stories have ill-defined and constantly evolving requirements, making it impossible to think everything through at once. Instead, the best Web story today is created and evolved using agile methods. These techniques allow journalists to continuously re-align stories with business and customer needs.

The Waterfall Model of storytelling, coined in 1970, will not work in such a world. Its idea was to tell stories by first reporting, then creating the story, then editing it, then creating and editing it again, and finally publishing it in one linear sequence.

The Waterfall Model is now considered a flawed method for Web stories because it is so rigid and unrealistic.

Non-storytelling people tend to think that stories are soft or easily changeable. Nope. Stories, like any system, have a design and structure; they are not as soft as they seem.

Yet the accelerating pace of business requires constant changes to storytelling. Using the Waterfall Model, these changes were impossible: the development cycle was too long, and stories were over-produced and over-engineered, ended up costing a fortune, and often did not work right.

A problem with the Waterfall Model was that in the information jungle, dynamic stories are not told once; they evolve over time in bits and pieces.

Storytelling needed a new approach. First, stories have to embrace change. Today’s assumptions and requirements may change tomorrow, and stories need to respond to changes quickly.

The stories created using agile methods are much more successful because they are evolved and adapted to Web customers. Like living organisms, these stories are continuously reshaped to fit the dynamic Web landscape of changing customer attention.

Stories have lots of moving parts, in other words, in the sense that they’re dynamic systems whose parts influence one another.

Tumolillo’s grokking a general point, of course, and I don’t want to read too deeply into his analogy. But one possible issue with this conception of the bits of content writers/publishers produce is that it may still neglect the necessarily short-term economics of the news. Developers can rejigger an application two weeks after its debut because they’re confident that people will still care about the application two weeks thence. Stories are ephemeral—or at least much more so. Life comes and goes.

That’s why I think it makes sense to distinguish between the article and the story:

The article has taken the story hostage. That must be turned on its head: the bits of content must be contingent on the people they discuss. The people, and also the issues, who constitute the story, as it were, must be liberated from the confines of the article. That’s the promise the internet makes to journalism in the twenty-first century. That’s the promise the database makes to news.

You can’t change the stories. Someone’s got to write them—and get paid for them, and move on to writing the next one. That’s partly why blogging was so transformative. Bloggers write something one day. And then they let it stand, never changing it. If they want to elaborate or correct or just revisit, they just write another post and link back.

The story’s in the map! (Or what I called a “news graph” in a fit of facebook exuberance once not too long ago.)

So let us have a web application that brings together the articles that compose a story—all of its sides, elaborations, corrections, and more. I’m looking at you, kindly folks at the Knight News Challenge.

I know I can do it.

PS. Don’t apply now. Sadly, the deadline has passed.

News Graph?

Mark Zuckerberg once upon a time extolled facebook and told us about this thing called a “social graph.” Bernard Lunn has just talked about an “innovation graph.”

What about a “news graph”? Hubs and spokes—call them nodes and bridges.

Nodes are the people who are the subjects of the news. Like Karl Rove or Paris Hilton or Chuck Prince. Maybe nodes can also be groups of people acting as a single agent. Like the 100th Congress or the Supreme Court or maybe even something really big like Disney Corp.

Bridges are the news issues connecting the people to whom they are relevant. Here, the bridges have substance apart from mere connection. It would be like a social graph having connections indicating different kinds of friendship—a solid line for a great friend, maybe, and a dashed line for a business acquaintance. Think of bridges like tags, just like those you might find in delicious. You find a piece of news, which comes in the form of a newspaper article or a blog post, for example, and you assign issue-tags to it. Then, in turn, you assign that article or post to the people-nodes whom it discusses. The issue-tags flow through the article to the people-nodes to which the article or post is assigned; the pieces of news fall out of this picture of the news graph.

When people-nodes have issue-tags thus associated with them, we can indicate when certain people-nodes share certain issue-tags. If we represent those shared characteristics with bridges that connect the people-nodes, we’re graphing the people in the news and the substantive issues that bind them all up into the story of the world at some slice in time.
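Here’s a minimal sketch of that flow in Python, assuming articles have already been issue-tagged and assigned to their people-nodes; all names are illustrative:

```python
from collections import defaultdict
from itertools import combinations

def build_news_graph(articles):
    """articles: iterable of (people, issue_tags) pairs. Tags flow
    through each article onto its people-nodes; bridges then connect
    any two people-nodes that share a tag, carrying the shared tags."""
    node_tags = defaultdict(set)
    for people, tags in articles:
        for person in people:
            node_tags[person] |= set(tags)

    bridges = {}
    for a, b in combinations(sorted(node_tags), 2):
        shared = node_tags[a] & node_tags[b]
        if shared:
            bridges[(a, b)] = shared  # the substance of the connection
    return node_tags, bridges
```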

Just note once more how the pieces of the news—the bits of content, as I call them—fall away and liberate the news and the people and issues it comprises from the narrow confines created by the printing press and furthered by HTML. (Check out Jarvis’s more than mildly inspiring post.) This kind of news graph would, at long last, make the bits of content contingent on the people and the issues they discuss. It’s the elegant organization for news.

This is, by the way, the third component of networked news. This is the data-driven network of the people and the issues in the news.

What Is Networked News?

Networked news describes a structure for consuming information. It means pulling in your news from a network of publishers—bloggers and traditional news outlets. It means pulling in your news from a network of readers—friends and experts and so on. And, crucially, networked news means breaking down the bits of content into their relevant constitutive pieces and reforming those pieces back into their own network. It means pulling in your news from a data-driven network of the people and the issues in the news: people like George W. Bush and Steve Jobs and Oprah, and issues and memes from “republican” and “iraq war” and “campaign 2008” to “iPhone” to “power of forgiveness.”

The concept of networked news grows out of the realization that the stories we care about exist between one author and another, between articles and blog posts, between newspapers and blogs. The story is a kind of thread that runs through time and in and out of the person-subjects and issue-topics of the news.

Networked news is not networked journalism, which is a structure for publishing information. See pressthink, buzzmachine, and newassignment.net for that parallel “genius” project to grow and diversify the number of sources from which we pull our news.

The first and second components of networked news are new but not unprecedented. Pulling in your news from a network of publishers is what we do when we subscribe to RSS feeds and read them in one place. It’s the river of news I read when I fire up Google Reader, which gives me news about the tech industry, about finance, and about politics. Techmeme, Memeorandum, Google News, and other memetrackers are other great examples of networking news from publishers. Newsmap, based on Google News, is the picture of this first component. Thoof and other news-focused web apps with similar recommendation engines also represent this publisher-based side of networked news.

Pulling in your news from a network of other readers is what Mario Romero is working on with his Google Reader Shared Items application for facebook. It’s also what Digg and others represent.

There are sites that represent both the first and second components of networked news. It’s what Newsvine, Topix, Daylife, and others represent. It’s what Pageflakes, Netvibes, iGoogle, and others represent. Though I haven’t actually toyed with the site yet (I’m still waiting on that invite, guys), it looks like Streamy sits at the current bleeding edge of the reader-based front of networked news.

The third component of networked news is, in some ways, the oldest, represented by simple searches on Google News or Technorati tags. It’s also the most difficult component—technically, socially, you name it. When I encourage Mario to let users browse his Google Reader Shared Items by tag, I’m encouraging him to let us readers of news pull in bits of content by issue and meme. When Streamy claims to have “filters”—which I called “substance- and source-based ways to browse, and subscribe to, kinds of content, by keyword and original author, respectively”—it’s claiming to have taken a few steps into this elusive third component of networked news.

One kind of representation of this third component, in the form of how Exxon putatively buys scientific research, is graphical. The “story” is the whole visual network, while the actors are broken down and interconnected within it. The bits of content, in this case, come in the form of profiles on each actor pictured. People and foundations are linked up by bridges connecting them. Those bridges, exxonsecrets says, represent the money that Exxon funnels through the foundations to pay the people to conduct and promote bogus climate research. Users can create, manipulate, and save their own graphical network maps for all to see.

A swirl of excited ideas in my head, it’s all rather tough to articulate. But I’ll get to it soon enough, bit by bit.

Spivack Gets the Semantic Web, But the Analogy Eludes

I simultaneously envy and fret over Nova Spivack’s style. I’m deeply sympathetic to his recent brain metaphor—in no small part because I’m a sucker for the killer analogy. Spivack’s analogy is catchy and seems useful: “I believe that collective intelligence primarily comes from connections—this is certainly the case in the brain where the number of connections between neurons far outnumbers the number of neurons; certainly there is more ‘intelligence’ encoded in the brain’s connections than in the neurons alone.” Then, bringing it home, “Connection technology…is analogous to upgrading the dendrites in the human brain; it could be a catalyst for new levels of computation and intelligence to emerge.” Ultimately, Spivack claims, “By enriching the connections within the Web, the entire Web may become smarter.”

There’s great stuff packed in here—frustratingly great stuff. Is there really more “intelligence” encoded in the brain’s connections than its neurons? What does it mean to believe that collective intelligence comes from connections? Or are we talking tautology (in which “intelligence” + “connections” = “connected” or “collected” or “collective intelligence”)? And what could it ever mean to upgrade, or enrich, our dendrites, the byzantine tree-like conductors of electrical inputs to our neurons? How would we be more intelligent?

Why not rehearse an argument that defends the aptness of this analogy? Why leave that chore—the really hard part—to me, to the reader? Unless they’re trivial or obvious, rigorous analogies alone cannot be more than invitations to real arguments. Don’t invite me to the party and tell me to bring the champagne!

“The important point for this article,” Spivack writes, “is that in this data model rather than there being just a single type of connection”—the present Web’s A-to-B hotlink—”the Semantic Web enables an infinite range of arbitrarily defined connections to be used.” Bits of information, people, and applications “can now be linked with specific kinds of links that have very particular and unambiguous meaning and logical implications. … Connections can carry more meaning, on their own. It’s a new place to put meaning in fact—you can put meaning between things to express their relationships.”
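To make the contrast concrete, here’s a toy sketch: today’s hotlink is an untyped pair, while a semantic connection is a triple whose middle term carries the meaning. The predicate names are invented for illustration, not drawn from any real ontology:

```python
# Today's Web: an untyped A-to-B hotlink.
hotlink = ("story_about_obama.html", "obama_bio.html")

# Semantic Web: the connection itself carries meaning, as a typed triple.
triples = [
    ("article_123",  "mentions",        "Barack Obama"),
    ("Barack Obama", "is_candidate_in", "campaign 2008"),
    ("Exxon",        "funds",           "bogus climate research"),
]

def connections(subject, predicate=None):
    """All typed connections from a subject, optionally one kind only."""
    return [t for t in triples
            if t[0] == subject and (predicate is None or t[1] == predicate)]
```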

Yes, when connections can carry arbitrarily more meaning, the human-relevant reasons for them to exist grow arbitrarily large—or, at least, as arbitrarily large as we bandwidth-bounded humans can handle. Only this kind of virtuous semantic circle, it seems to me, can radically improve the intelligence of the web as a whole. What’s important are not just connections with more meaning (“upgraded” dendrites, I suppose). What’s important is that connections with more meaning promise a blossoming of the total number of connections (more “dendrites”)—each of which can itself carry more meaning.

The web will become more intelligent, or just more useful, when projects like Spivack’s and like Freebase—which I’ve checked out a bit (facebook me for an invitation to the private alpha)—expand the scope of reasons for connections among bits of information, people, and applications. Of course, that’s the whole idea for the semantic web. With more reasons for connections, we get more meaning for connections. With more meaning for connections, we get more connections. In the end, we get more connections with more meaning—a kind of semantic multiplier effect.

It’s just that we’re talking about the Internet here. Brains are still a few years out.

