Keyword search, the ubiquitous mode of Internet search, is about to be overtaken by a new breed of semantic search technology, according to analysts at research firm Ovum. While keyword search remains the most popular method, it is usually imprecise: users sometimes get up to 30,000 hits on a search and then have to sift through a list of loosely related results to find relevant documents.
"This is where a new breed of so-called semantic technologies comes into the frame. Unlike ranking algorithms such as Google's PageRank for predicting relevancy, semantic search dips into the meaning in language to produce highly relevant search results," according to a report published by Ovum analysts Mike Davis and Madan Sheina.
Notable semantic web providers singled out by the analysts include Expert System, Powerset, Yedda, Trovix and Hakia. According to the authors, awareness of semantic search rose when Microsoft acquired two semantic search companies, Powerset and Zoomix.
Expert System's application, Cogito, is designed around the principles of human comprehension so that content is understood in the way its author intended. This is something that keyword search ignores.
"A Google search for the word 'jaguar' would pull up content around the animal and the car. Semantic search would look not only at the keyword but also other words around it like 'jungle' or 'saloon' to separate the two meanings," said the authors.
Besides semantic search, there are other approaches, including heuristic and ontological, linguistic and text-mining, and statistical methods. However, Expert System claims that these approaches fall short, addressing only the morphological and grammatical aspects of analysis.
Other search engines often hit a brick wall when it comes to deep analysis. For example, when a heuristically driven search engine sees two adjectives in a sentence, it usually washes them out and scores the sentence as neutral, because it has no understanding of what each of the two adjectives is pointing at.
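The "washing out" effect can be illustrated with a small sketch. It is a toy under stated assumptions, not a description of how any vendor's engine works: the polarity lexicon, the noun list and the function names are all invented for illustration, and a real system would use a parser instead of a fixed noun list.

```python
# Hypothetical two-word polarity lexicon, invented for illustration.
POLARITY = {"great": 1, "terrible": -1}
# Hypothetical noun list standing in for real syntactic analysis.
NOUNS = {"screen", "battery"}


def flat_score(sentence):
    """Sum adjective polarities over the whole sentence, ignoring targets."""
    return sum(POLARITY.get(word, 0) for word in sentence.lower().split())


def targeted_scores(sentence):
    """Attach each adjective to the most recent noun seen before it."""
    scores = {}
    current_noun = None
    for word in sentence.lower().split():
        if word in NOUNS:
            current_noun = word
        elif word in POLARITY and current_noun is not None:
            scores[current_noun] = scores.get(current_noun, 0) + POLARITY[word]
    return scores


sentence = "the screen is great but the battery is terrible"
print(flat_score(sentence))       # opposing adjectives cancel to 0
print(targeted_scores(sentence))  # one score per noun
```

The flat scorer reports a neutral sentence, while the targeted version keeps the positive view of the screen apart from the negative view of the battery.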
In comparison, semantic search looks at both sentence logic - how words in a sentence relate to one another - and semantic analysis - understanding the context of keywords.
When a term is ambiguous, that is, when it can have several meanings (for example, "bark"), semantic analysis of the words that wrap around it is needed to give it its true meaning and context.
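A minimal sketch of this kind of context-based disambiguation follows, using the "jaguar" and "bark" examples. It simply counts how many cue words for each sense appear in the surrounding text. The sense inventory, the cue words and the `disambiguate` function are assumptions made for illustration; commercial semantic engines are far more elaborate.

```python
# Hypothetical sense inventory: each sense lists cue words that tend to
# appear near it. All entries are invented for illustration.
SENSES = {
    "jaguar": {
        "animal": {"jungle", "cat", "prey", "rainforest"},
        "car": {"saloon", "engine", "dealer", "drive"},
    },
    "bark": {
        "tree covering": {"tree", "trunk", "oak", "wood"},
        "dog sound": {"dog", "loud", "growl", "puppy"},
    },
}


def disambiguate(term, text):
    """Return the sense whose cue words overlap most with the text.

    Returns None when the term is unknown or no cue word is present.
    """
    words = set(text.lower().split())
    best_sense, best_score = None, 0
    for sense, cues in SENSES.get(term, {}).items():
        score = len(words & cues)  # number of cue words found in the text
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


print(disambiguate("jaguar", "the jaguar stalked its prey through the jungle"))
print(disambiguate("jaguar", "a jaguar saloon with a new engine"))
print(disambiguate("bark", "the dog let out a loud bark"))
```

With no cue words in the text, the function returns None instead of guessing, which mirrors the point that a keyword alone does not carry enough information.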
A lexical database
The engineers at Expert System say Cogito can go the extra mile because it has a semantic network, a lexical database that provides a knowledge representation of word definitions and their relationships. The company poured Webster's dictionary into an in-memory database comprising 350,000 words and 2.8 million relationships.
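As a rough picture of what a semantic network holds, the sketch below stores word definitions and typed relationships in memory and follows one relationship type transitively. The class name, the relation labels and the sample entries are assumptions for illustration; the article does not describe Cogito's actual data model.

```python
from collections import defaultdict


class SemanticNetwork:
    """Toy in-memory store of word definitions and typed relationships."""

    def __init__(self):
        self.definitions = {}
        # Maps (word, relation) to the set of words it points to.
        self.relations = defaultdict(set)

    def add_word(self, word, definition):
        self.definitions[word] = definition

    def relate(self, source, relation, target):
        self.relations[(source, relation)].add(target)

    def related(self, word, relation):
        """Words directly linked from `word` by `relation`, sorted."""
        return sorted(self.relations.get((word, relation), set()))

    def ancestors(self, word, relation="is_a"):
        """All words reachable by following `relation` repeatedly."""
        found, frontier = set(), [word]
        while frontier:
            current = frontier.pop()
            for parent in self.relations.get((current, relation), ()):
                if parent not in found:
                    found.add(parent)
                    frontier.append(parent)
        return found


# Hypothetical sample entries.
net = SemanticNetwork()
net.add_word("jaguar", "a large spotted cat of the Americas")
net.relate("jaguar", "is_a", "cat")
net.relate("cat", "is_a", "animal")
net.relate("jaguar", "lives_in", "jungle")

print(net.related("jaguar", "lives_in"))
print(sorted(net.ancestors("jaguar")))
```

Keeping relationships typed is what lets a query distinguish "a jaguar is a kind of cat" from "a jaguar lives in the jungle", instead of treating both as a generic association.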
"Expert System's semantic network also focuses on common words. That's different from most ontological approaches that concern themselves with wrapping meaning and context around specialized content, such as scientific terms, and skip common words that comprise 90 per cent of all content," said the Ovum authors.
However, the authors cautioned that semantic search is still riddled with "a lot of theoretical hype but little real substance or proof that it works better than current search technology.
"Semantic networks are tricky to build and not all are equal. It is unlikely that semantic technologies will ever be able to provide 100 per cent precision in their analysis and results. Moreover there are still question marks over potentially sticky performance issues with semantic searches that eat up more processing cycles."