20 September 2008

Teaching words to computers

The internet got smarter this week with the release of a semantic map that teaches computers the meanings behind words — and gives the machines a vocabulary far larger than that of a typical American college graduate.

Cognition Technologies began licensing the map on Tuesday to software creators interested in having programs “understand” words based on tenses and sentence context, in much the same way as the human brain does.

“We have taught the computer virtually all the meanings of words and phrases in the English language,” Cognition chief executive Scott Jarus said.

“This is clearly a building block for Web 3.0, or what is known as the semantic web. It has taken 30 years; it is a labour of love,” Jarus said.

The semantic map is reportedly the world’s largest, and gives computers a vocabulary more than 10 times as extensive as that of a typical US college graduate.

The coming third generation of life online is predicted to feature intuitive artificial intelligence applications that work swiftly across broadband internet connections.

When applied to internet searches, semantic technology delivers results oriented to what people seem to be seeking, instead of simply matching the words they type against online content.

For example, a semantic online search for “melancholy songs with birds” would know to link sadness in lyrics with various species of birds.
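The difference between literal matching and concept-based matching can be illustrated with a toy sketch. The concept map, song texts and function names below are invented for illustration only and are not Cognition's data or method; a real semantic map would be vastly larger and would also handle tense and context.

```python
# Hypothetical concept map: each query term maps to words that express the same idea.
CONCEPTS = {
    "melancholy": {"melancholy", "sad", "sorrow", "lonesome", "weeping"},
    "birds": {"bird", "birds", "sparrow", "nightingale", "robin"},
}

# Invented song texts, used only to exercise the two searches.
SONGS = {
    "Song A": "a lonesome sparrow cries at dusk",
    "Song B": "the happy robin sings in the morning sun",
    "Song C": "sad rain falls on the empty street",
}

def keyword_search(query_terms, docs):
    """Return titles whose text contains every query term literally."""
    return [title for title, text in docs.items()
            if all(term in text.split() for term in query_terms)]

def semantic_search(query_terms, docs, concepts):
    """Return titles that contain, for every query term, at least one
    word belonging to that term's concept set."""
    results = []
    for title, text in docs.items():
        words = set(text.split())
        if all(words & concepts.get(term, {term}) for term in query_terms):
            results.append(title)
    return results

literal = keyword_search(["melancholy", "birds"], SONGS)
conceptual = semantic_search(["melancholy", "birds"], SONGS, CONCEPTS)
print(literal)     # no song contains both literal words
print(conceptual)  # only the song with a sad word and a bird word
```

In this sketch the literal search finds nothing, while the concept-based search returns the one song that pairs a word for sadness with a species of bird.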

Cognition’s semantic map is already used in LexisNexis Concordance “e-discovery” software to sift through documents amassed during evidence phases of trials.

“We help them find the needle in a haystack,” Jarus said. “It used to be boxes and boxes of paper and now 80% of it is digital. Lawyers can search for a smoking gun within that discovery material.”

Cognition’s Caselaw program uses the technology to mine more than a half-century of US federal court decisions for legal precedents, according to the company.

The semantic map is also employed in a widely used medical database.

Cognition says it has also “semantically enabled” the globally popular online encyclopedia Wikipedia.

A Web 3.0 target is to develop artificial intelligence “agents” that mine mountains of information on the internet for material that suits the interests of the people they serve.

“It would be a software application constantly looking for things you might be interested in while accurately understanding the concepts of what you are looking for,” Jarus said, describing it as “artificial intelligence agents working for you on a push basis instead of a pull basis”.

Cognition has a handful of rivals, with each firm taking its own approach to semantic technology.

In July, US software giant Microsoft bought San Francisco-based Powerset, a three-year-old start-up that specialises in interpreting the intent of people’s internet searches instead of matching specific words they use.

Microsoft said it plans to use Powerset technology to enhance its free Live Search service, which has been mired in third place behind Google and Yahoo! in the lucrative internet search-related advertising arena.

Powerset’s semantic search merges linguistics with engineering in a software platform to figure out what people are seeking based on questions or phrases.

Standard search engines respond to individual words in the search query.

Microsoft senior vice-president of search, portal and advertising Satya Nadella said at the time that one-third of today’s online searches don’t get people the answers they seek on the first try.

“Search engines don’t understand today that ‘shrub’ and ‘tree’ are similar concepts,” Nadella wrote in a blog posting. “We don’t understand that ‘cancer’ sometimes refers to a disease and sometimes refers to a horoscope and when a query or a web page refers to which.”
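The ambiguity Nadella describes can be sketched with a simple context-overlap heuristic: pick the sense of “cancer” whose cue words appear most often in the rest of the query. The cue lists and the function name here are assumptions made for illustration; this is not Powerset's or Microsoft's technique, which the article does not detail.

```python
# Hypothetical cue words for two senses of "cancer".
SENSES = {
    "disease": {"tumor", "treatment", "chemotherapy", "symptoms", "doctor"},
    "zodiac": {"horoscope", "zodiac", "astrology", "sign", "leo"},
}

def disambiguate(query, senses):
    """Return the sense whose cue words overlap most with the query,
    or None when the query offers no context at all."""
    words = set(query.lower().split())
    scores = {sense: len(words & cues) for sense, cues in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(disambiguate("cancer treatment symptoms", SENSES))
print(disambiguate("cancer horoscope today", SENSES))
print(disambiguate("cancer", SENSES))
```

With no surrounding words the sketch declines to guess, which mirrors the problem described: a bare keyword does not reveal which meaning is intended.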

Financial terms of the deal were not disclosed but unconfirmed reports were that Microsoft may have paid as much as $100-million for Powerset. — Sapa-AFP