Semantic Web technology provider Calais plans to commence beta testing for the next version of its OpenCalais Web Service in August 2012. The release will include behind-the-scenes improvements to the processing pipeline as well as user-facing additions: new entities, facts and events, and enhanced social tagging.

The “Semantic Web” refers to meta-information hidden in the page code that is derived from the content itself, with the aim of letting Web services and search engines know exactly what's there without having to guess from keywords and tags. The OpenCalais Web Service creates rich semantic metadata for the content users submit.

Using natural language processing (NLP), machine learning and other methods, OpenCalais analyzes a document and finds the entities within it. OpenCalais also returns the facts and events contained within the text and classifies the submitted entries. Users then receive tags they can incorporate into applications such as search, news aggregation, blogs and catalogs.
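To make the last step more concrete, here is a minimal sketch of how an application might consume tags once a tagging service has returned them. The record layout (`kind`, `type`, `name`), the sample values and both helper functions are assumptions made for illustration; this is not the OpenCalais response format or API.

```python
from collections import defaultdict

# Hypothetical tagging result. The field names and values are illustrative
# assumptions, not the actual OpenCalais response schema.
SAMPLE_RESULT = [
    {"kind": "entity", "type": "Person", "name": "Tom Tague"},
    {"kind": "entity", "type": "Company", "name": "Thomson Reuters"},
    {"kind": "event", "type": "ProductRelease", "name": "OpenCalais beta"},
    {"kind": "topic", "type": "Category", "name": "Technology"},
    {"kind": "entity", "type": "Person", "name": "Tom Tague"},  # duplicate mention
]


def group_tags(items):
    """Group tag names by kind, keeping first-seen order and dropping duplicates."""
    grouped = defaultdict(list)
    for item in items:
        names = grouped[item["kind"]]
        if item["name"] not in names:
            names.append(item["name"])
    return dict(grouped)


def search_keywords(items):
    """Return sorted, lowercased entity names, e.g. to feed a search index."""
    return sorted({item["name"].lower() for item in items if item["kind"] == "entity"})


if __name__ == "__main__":
    print(group_tags(SAMPLE_RESULT))
    print(search_keywords(SAMPLE_RESULT))
```

A blog or news aggregator could use the grouped output to render tag lists per category, while a search application might only need the flattened keyword list.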

OpenCalais Plans ‘Spring Cleaning’

Although mid-summer is approaching, Tom Tague, VP of Platform Strategy at Calais parent company Thomson Reuters and the executive in charge of OpenCalais, said in a blog post that OpenCalais will be receiving "spring cleaning and some touchups."

In addition to the previously mentioned pipeline improvements, which Tague said won’t be readily visible to end users but will “set the stage for greater flexibility in the future,” OpenCalais will feature new entities, facts and events related primarily to politics and intra- and international conflict. This means there will be new information in areas such as candidates, party affiliations and arms purchases. Tague also promised enhancements to capabilities for tagging content for social networks. 

Semantic Web is Here to Stay

Although OpenCalais has been somewhat quiet of late, there has been a lot of recent activity on the semantic Web front. In May 2012, Google launched its Knowledge Graph feature, an addition to its search function intended to help users distinguish between, or clarify, the options available when they perform a search. It links related things together or categorizes them so users can get to what they're after more quickly.

Last month Webnodes, a .NET-based Web CMS platform, released version four of its semantic-based CMS. The Webnodes system builds in semantic relationships right from the dashboard. This makes content more flexible for use across channels and also makes site navigation more intuitive.

And the BBC's new 2012 Summer Olympics website, which takes advantage of RDF data and a dynamic semantic publishing (DSP) architecture, will use Semantic Web functionality. The BBC used a somewhat similar system for its 2010 World Cup website, but is now adding systems by fluid Operations on the content workflow side and Ontotext on the database side. Ontotext has long been a leader in semantic technology, and is Semtech 2012's biggest sponsor. The 2012 Olympics website will use linked data to allow a handful of journalists to populate thousands of Web pages with dynamic content.

So OpenCalais is in good company in its efforts to expand and enhance the capabilities and reach of the semantic Web. This type of sophisticated tagging, classifying and linking of content really isn’t so “semantic” at all.

Calais will run the OpenCalais beta for approximately a month to identify any issues and will roll the release out to production after that. The upgrade will be 100 percent backwards compatible.