The “Content” API

Most of the APIs we think about are data-driven. Developers can use the Facebook API to build apps that run in Facebook. Developers can use the Twitter API to build applications that leverage the Twitter user base and content. Developers can use the AMEE API to build carbon footprinting applications.

So it was interesting to me to see that the New York Times announced the availability of their first API this week. That first API is also really about data: the aggregated set of campaign finance data their reporters are using to write about the US presidential election.

But I suspect that the Times will start to explore how to turn their content into an API as well. Imagine if you could get access to all of the stories written about the Empire State Building via an API. Or if you could get all the stories written about Mike Bloomberg.
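Here is a rough sketch, in Python, of what asking for those stories might look like. Everything in it is hypothetical: the `api.example.com` endpoint, the `topic` and `limit` parameters, and the `build_story_query` helper are assumptions made for illustration, and none of it reflects an actual Times API.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration; not a real Times URL.
BASE_URL = "https://api.example.com/v1/stories"


def build_story_query(topic, limit=10):
    """Build the request URL for stories about a topic (assumed parameters)."""
    params = {"topic": topic, "limit": limit}
    return f"{BASE_URL}?{urlencode(params)}"


print(build_story_query("Empire State Building"))
print(build_story_query("Mike Bloomberg", limit=5))
```

The point is less the syntax than the shape of the request: you ask by subject, not by table or field, and get stories back.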

Of course, content is data, but it’s a bit different. Content is unstructured data that comes with a lot of context, semantics, and relationships. Once the vast databases of content that exist inside the big media companies start becoming available via APIs, we can start to do some amazing things.

We’ve seen a number of companies that have built algorithms using Wikipedia data. And the things they’ve built are pretty powerful. But if instead of being limited to Wikipedia, they could use dozens of highly trusted, accurate content sources, they could probably do much more.

Slowly but surely, the web is becoming more intelligent. I am not sure we’ll ever reach the nirvana of the "semantic web," but we are certainly seeing it become smarter every day, and I think "content APIs" will be an important part of how that happens.
