Archive for July, 2008

If Google had Semantic Technology…

July 18th, 2008 by Dr. Riza C Berkan, CEO

I am writing this blog article in response to the mention of hakia in the NYTimes blog article written by Saul Hansell, who praises Google’s technology in the wake of Google’s declining profits and a 12% drop in its share price.

Mr. Saul Hansell and I had an exciting conversation about the future of Web search. Our point in comparison to Google was, and always is, that semantic search is not an option; rather, it is an irreversible technological reform that is already taking place in multiple dimensions. Mr. Hansell reassured us that Google already has these capabilities in various forms.

If Google had semantic technology, how can anyone explain the inadequacies encountered with the following queries (in comparison to hakia)?

Does insurance cover brain surgery?

Do DOE and EPA have conflicts?

longest strike in the history of railroads

What are the benefits of shifting lanes?

If Google had semantic technology, why don’t we see categorical results showing all the aspects of a short query?

piano, peru, the meaning of life, Mark Rylance, Biodefense and Bioterrorism, Albinism, America’s Got Talent

If Google had semantic technology, it would be able to rank credible sources at the top for important queries like this:

what treats headache

Google results, from top to bottom, come from 1- NYTimes (Art Section), 2- NYTimes (Art Section), 3- Revolutionhealth.com (commercial outfit), 4- medscape.com (commercial outfit), … hakia results, in the same order, come from 1- Philadelphia Chronicle (news), 2- Kidshealth.org (recommended by Medical Libraries Association), 3- Mayoclinic.com (recommended by Medical Libraries Association), 4- who.int (World Health Organization), 5- Wikipedia, …

If Google had semantic technology, it would not bring a result like:

The “Weinple bill, authorizing- the appointment of street car’ employes to be special policemen during strikes was killed. In the Senate there was debate on..

to the query:

what bill was killed in the senate

We have no idea how Google’s algorithm works, and it does a great job in so many ways. But one thing is clear: the results show no sign of a systematic effort to understand the meaning of concepts. They don’t show ranking based on quality. They don’t show aspect categorization beyond statistical clustering. They don’t show question type detection.

My conversation with Mr. Hansell reflected our experience with Google, a simplified version of which I outlined above.

The small differences in search ability shown here may naturally grow into larger differences in the future as hakia’s semantic technology advances step by step. What this means for the search business remains to be seen. But one thing is for sure: you cannot do a patch job to create semantic technology out of a system that indexes keywords and augments them with statistics. Semantic technology has to be built from scratch, from first principles.

hakia + Yahoo! Search BOSS Integration

July 14th, 2008 by hakia Team

In response to the recent announcements, we have received a number of inquiries about the integration of Yahoo! Search BOSS into hakia. While most press articles did not speculate, some blogs offered incorrect explanations. To clear the air, we will explain this integration using the simple diagram shown below.

hakiaYahooIntegration.jpg

First, all yellow boxes indicate the various levels at which semantics (Ontological Semantics) is rendered into the search results, a share that increases continuously as hakia analyzes more Web pages.

hakia’s standard operation is depicted by steps 1, 2, 3, and 4. The back-end operations in steps 5, 6, and 7 fill the result content. In step 5, hakia crawls the Web exhaustively and/or focuses on selected (credible) Websites. In step 6, the QDEXing operation (hakia’s proprietary indexing system) takes place with OntoSem rendering.

Yahoo! Search BOSS integration is depicted by steps 1a and 1b. hakia’s main interest is step 1a, where the online query produces Yahoo index results, which hakia then filters to identify credible Web pages to crawl for semantic analysis. This is hakia’s “evolution” path, where QDEXing follows users’ queries. In effect, hakia gets better in the subjects that interest its users. Step 1b is also available for image search and for backfilling if necessary. Regardless, all results accumulated in the dynamic pool go through hakia’s SemanticRank algorithm, as shown in step 3.
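To make the flow of steps 1a, 1b, and 3 concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not hakia's implementation: the `CREDIBLE_HOSTS` allowlist, the `Result` type, the numeric `score` field, and the `handle_query` function are all hypothetical, and a plain sort by score stands in for the proprietary SemanticRank algorithm.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical allowlist. The post does not say how hakia decides
# which pages are credible, so a fixed set of hosts stands in here.
CREDIBLE_HOSTS = {"who.int", "mayoclinic.com", "kidshealth.org"}


@dataclass
class Result:
    url: str
    score: float  # placeholder relevance score, NOT hakia's SemanticRank


def host_of(url: str) -> str:
    """Return the lowercase host of a URL without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def is_credible(result: Result) -> bool:
    return host_of(result.url) in CREDIBLE_HOSTS


def handle_query(own_results, external_results, crawl_queue, qdexed_urls):
    """Pool results for one query and queue credible pages for analysis."""
    # Step 1a: credible external pages that have not been analyzed yet
    # are queued for crawling and later semantic analysis.
    for r in external_results:
        if is_credible(r) and r.url not in qdexed_urls and r.url not in crawl_queue:
            crawl_queue.append(r.url)

    # Step 1b: external results backfill the pool where the semantic
    # index has no entry for that URL.
    pool = list(own_results)
    seen = {r.url for r in pool}
    for r in external_results:
        if r.url not in seen:
            pool.append(r)
            seen.add(r.url)

    # Step 3: everything in the pool goes through a single ranking pass
    # (a simple sort by score, as a stand-in).
    return sorted(pool, key=lambda r: r.score, reverse=True)
```

In this sketch, a page that is already in `qdexed_urls` is never queued again, so the crawl queue grows more slowly as coverage of credible pages improves.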

The rate at which content is QDEXed with the help of Yahoo! Search BOSS will eventually reach a saturation point, because most credible pages will already have been QDEXed. We recently deployed a statistical measurement system to measure the overall effect; the results will be announced later.

Yahoo! Search BOSS is an opportunity for innovators to focus on what they do best. We think it is one of the best initiatives launched in the search market in recent years.

hakia Joins Yahoo!’s Search BOSS

July 10th, 2008 by hakia Team

We are pleased to announce our participation in Yahoo!’s Search BOSS (Build Your Own Search Service) today. As part of this initiative, we have access to one of the largest Web directories on the Internet, which accelerates hakia’s QDEXing process and semantic analysis of the Web’s content. QDEXing is a critical element that replaces the traditional index to allow scalable semantic search. Without this kind of infrastructure, the application of semantic technology is destined to remain limited, for example to covering Wikipedia only.

The search landscape is currently in a dynamic stage of reinvention. Yahoo! is inviting more innovation to enter the market, while Microsoft validates the importance of semantic search technology with its recent acquisition of Powerset. For the latter, we congratulate both parties, yet are disappointed by the fact that we’ve lost our favorite competitor. From now on, we will look for traces of the Powerset-effect in LiveSearch.

For hakia’s part, we will keep up our momentum as we progress towards coming out of BETA later this year. As we always say, the everyday application of semantic technology is an irreversible, long-overdue process. It is coming…

From the Semantics of Humor to Semantic Search

July 1st, 2008 by Dr. Christian Hempelmann, Chief Scientific Officer

Ontological Semantics, in our current proprietary version OntoSem 2.3, is hakia’s core natural language technology. It is a theory of language meaning with an interesting pedigree, one of its ancestors being the “Semantic Script Theory of Humor”, developed by Victor Raskin in 1985 in his book Semantic Mechanisms of Humor. While it might sound a little funny, as noted by ZDNet’s Paul Miller, with whom I recently talked at the Semantic Technology Conference, the development from a theory of humor to an application in Internet search is actually quite logical.

For a text to be humorous is for it to have a specific meaning. And in order to adequately describe meaning, we need a complex semantic theory including a rich repository of world information, an ontology, structured around “scripts”. These chunks of world knowledge are evoked in a specific constellation in humor. Let’s look at an example:

“Is the doctor at home?” the patient asked in his bronchial whisper.
“No”, the doctor’s young and pretty wife whispered in reply. “Come right in.”

When humans process this text, “doctor” will trigger a general medical script for them, into which the meanings of “patient” and “bronchial whisper” fit nicely. In this medical script, the doctor’s wife will then probably “whisper” to comfort the ailing patient. The additional information that she is “young” and “pretty” seems at least odd, until the punchline tells us that we are not reading about a reverse house visit, but, at least in the mind of the wife, about an adulterous encounter. For humans, just as for computers, to get this joke, they need world knowledge that is structured in such a way that a medical script stands in opposition to a non-medical, sexual script: enter OntoSem! For a computer to get the meaning of text on a webpage and match it to a query, it needs just that: a rich ontology that it can use to identify the meanings of words in language, no matter whether it’s English or Japanese.
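The idea of two opposed scripts being evoked by the same text can be sketched in a few lines of Python. This is a toy illustration only and has nothing to do with how OntoSem works internally, which is proprietary: the `SCRIPTS` word list, the `OPPOSED` pairs, and both function names are hypothetical, and a real ontology would map word meanings rather than surface words.

```python
import re

# Hypothetical toy lexicon: surface words mapped to the scripts they evoke.
SCRIPTS = {
    "doctor": {"MEDICAL"},
    "patient": {"MEDICAL"},
    "bronchial": {"MEDICAL"},
    "young": {"SEXUAL"},
    "pretty": {"SEXUAL"},
}

# Hypothetical list of script pairs treated as being in opposition.
OPPOSED = [frozenset({"MEDICAL", "SEXUAL"})]

JOKE = (
    '"Is the doctor at home?" the patient asked in his bronchial whisper. '
    '"No", the doctor\'s young and pretty wife whispered in reply. '
    '"Come right in."'
)


def evoked_scripts(text):
    """Count how many words in the text evoke each script."""
    counts = {}
    for word in re.findall(r"[a-z]+", text.lower()):
        for script in SCRIPTS.get(word, ()):
            counts[script] = counts.get(script, 0) + 1
    return counts


def opposed_pairs(text):
    """Return the opposed script pairs that are both evoked by the text."""
    evoked = set(evoked_scripts(text))
    return [tuple(sorted(pair)) for pair in OPPOSED if pair <= evoked]


print(evoked_scripts(JOKE))
print(opposed_pairs(JOKE))
print(opposed_pairs("The doctor examined the patient."))
```

The joke evokes both scripts, so the opposed pair is reported, while the plain medical sentence evokes only one script and reports nothing. Word counting like this is exactly the kind of keyword statistics that a semantic approach is meant to go beyond; the sketch shows only the structure of the argument, not a method for understanding text.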

Thus, it was quite logical that I represented hakia at the International Summer School for Research in Humor and Laughter in Galati, Romania, last week, where lecturers from around the world introduced participants to existing theories and methods in humor research. My talks included general introductions to (computational) semantics and a dedicated lecture on OntoSem as a tool for humor research as well as the core of an Internet search engine.

Making sense of humor with OntoSem is just a specific version of making sense of language in general, the foundation of hakia’s semantic search.