Questions for Open Calais?

So I’m interviewing the folks over at Thomson Reuters on Thursday for a piece that should be published at CJR. We’ll be talking about a relatively new service they’re providing freely. That service is called Open Calais, and it does some fancy stuff to plain text.

What fancy stuff? If you send it a news article, Open Calais will give you back the deets—and, way more importantly, it will make them obvious to your computer as well. That’s my description inspired by the Idiot’s guide, anyhow. (Yes, “deets” means “details” to cool kids, so get on board.)

<digression>Basically, the whole point of the semantic web is to make what’s obvious to you also obvious to your computer. For people who have always anthropomorphized their every laptop and piece of software—loved them when they just work, coaxed them when they slow to a crawl, and yelled at them when they grind to a halt—this can be a serious head-scratcher and a boring one at that. I blame Clippy the Microsoft Office Assistant. I also blame super-futuristic sci-fi movies that give us sugar-plum images of computers as pals—bright, sophisticated, and in possession of a knowledge like we epistemologically gifted humans have. Screw Threepio. Finally, I blame that jerk Alan Turing, who fed us the unintuitive half-truth that a computer could be conscious.

So it feels really silly to say so, again, but computers are ones and zeroes, NAND gates and NOR gates. They’re called computers because they do computation. They don’t do meaning as such. (Oh boy do I hope I get flamed in the comments by someone who knows his way around BsIV way better than I do.)</digression>

Open Calais will pick out people, companies, and places, which are called “named entities.” It will also identify facts and events in articles. Because Thomson Reuters is a finance-focused information provider, many of the facts and events it can recognize are about business relationships like subsidiaries and events like bankruptcies, acquisitions, and IPOs. The list goes on and on. Finally, Open Calais will identify very broad categories like politics, business, sports, or entertainment.
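To make those three kinds of output a bit more concrete, here is a rough Python sketch of how you might hold them once you have them. Fair warning: this is not the Open Calais API or its response format. The class names, the field names, and the sample values are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical shapes only. These are NOT Open Calais's real response types;
# every name below is made up to mirror the prose description.

@dataclass
class NamedEntity:
    kind: str  # e.g. "Person", "Company", or "Place"
    name: str

@dataclass
class FactOrEvent:
    kind: str                # e.g. "Acquisition", "Bankruptcy", "IPO"
    participants: List[str]  # names of the entities involved

@dataclass
class Annotations:
    entities: List[NamedEntity] = field(default_factory=list)
    events: List[FactOrEvent] = field(default_factory=list)
    categories: List[str] = field(default_factory=list)  # e.g. "business"

    def names_of(self, kind: str) -> List[str]:
        """Return the names of all entities of the given kind, in order."""
        return [e.name for e in self.entities if e.kind == kind]

# Invented sample data, standing in for what a service might hand back.
sample = Annotations(
    entities=[
        NamedEntity("Company", "Acme Corp"),
        NamedEntity("Company", "Widget Inc"),
        NamedEntity("Person", "Jane Doe"),
    ],
    events=[FactOrEvent("Acquisition", ["Acme Corp", "Widget Inc"])],
    categories=["business"],
)

print(sample.names_of("Company"))
```

The point is only that once the deets are in a structure like this, your computer can filter and count them without having to "read" the article.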

Open Calais will also associate these deets with further information on teh interwebs. So just for instance, if the web service identifies a person in your article, it will give you and your finicky, picky, and ultimately dumb computer a nice pointer to this computer-friendly version of Wikipedia called DBpedia. Or if Calais identifies a movie, it will offer a pointer to linked data about that movie. Linked data, as far as I can tell, is still a pretty vague notion. It promises to deliver more than it has to date, and that’s not a derogation.
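If "pointer" sounds abstract, here is a tiny Python sketch of what one looks like. DBpedia names its resources after Wikipedia article titles, with spaces swapped for underscores, under http://dbpedia.org/resource/. The function name `dbpedia_uri` is my own invention, and this naive name-to-URI guess is an assumption for illustration, not how Open Calais actually resolves its entities (a real service has to disambiguate between people who share a name, for starters).

```python
from urllib.parse import quote

# Base of DBpedia resource URIs.
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"

def dbpedia_uri(name: str) -> str:
    """Guess a DBpedia resource URI from a plain name (hypothetical helper).

    Naive on purpose: trims whitespace, swaps spaces for underscores, and
    percent-encodes anything unsafe. It does no lookup or disambiguation.
    """
    title = name.strip().replace(" ", "_")
    return DBPEDIA_RESOURCE + quote(title)

print(dbpedia_uri("Alan Turing"))
```

A URI like that is the kind of thing a program can follow to fetch structured facts, which is the whole appeal.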

But why freely—or essentially so in most cases? If you keep within liberal limits, you owe Thomson Reuters no money in exchange. Correct me if I’m wrong, but all they want, more or less, is that you offer them attribution and use their linked-data pointers (they call them URIs). Ken Ellis, chief scientist at Daylife, which may be best known to journalists through its association with Jeff Jarvis, took a stab at answering the “why free?” question:

Thomson Reuters has a large collection of subscription data services. They eventually want to link to these services. Widespread use of Calais increases the ease with which customers can access these subscription data services, ultimately increasing their ability to extract revenue from them.

That sounds to me like Thomson Reuters is interested in making its standards the standards. And that bargain really does sound reasonable. I guess.

But journalists are a wildly skeptical bunch. They’re skeptical—aloof even, way too cool for school and ideology. Journalists have a pretty acute and chronic deficiency in a little thing called trust. Maybe it’s justified, or maybe it’s not. Maybe it’s mostly justified, or maybe it’s mostly unjustified.

Either way, my gut’s telling me that journalists are going to need a fuller narrative from Thomson Reuters about why they should rely on another news and information company. When I talk to Tom and Krista, that’s what I’ll be largely interested in.

And you? What do you want to know about Open Calais? Leave your questions in the comments, and I’ll be sure to ask them.

1 Response to “Questions for Open Calais?”


  1. 1 Roark S 2009 May 14 at 2:41 pm

    If you are a publisher hoping to use Calais…you need to think twice. The future of the web is going to be increasingly destinationless, and the regulatory frameworks around fair use are most certainly going to change. One line in the Calais terms and conditions is worrying:

    “You understand that Thomson Reuters will retain a copy of the metadata submitted by you or that generated by the Calais service. By submitting or generating metadata through the Calais service, you grant Thomson Reuters a non-exclusive perpetual, sublicensable, royalty-free license to that metadata. From a privacy standpoint, Thomson Reuters’ use of this metadata is governed by the terms of the Thomson Reuters and Calais Privacy Statements.”

    Does this mean thomson reuters would be able to take my content and then monetize against it? It’s a great platform, but most certainly a means to a grander vision.


Josh Young's Facebook profile