I have never met Peter Norvig, Google’s director of research, except that we both appear in an article published yesterday in Forbes. Our opposing views came together there, which prompted me to write this blog entry in a somewhat humorous spirit.
First, I raise my hat to Peter Norvig for his analogy: semantic search = anti-gravity machine. According to the article, he said that the semantic technology experiments he conducted in 1978 yielded results like a dancing bear. He seems to have subscribed to the idea that the near future of search will not go beyond being a slot machine, as Chris Sherman describes today’s expectation. Having stated his views about the feasibility of semantic search, Dr. Norvig adds that Google is actually working on it behind closed doors (just in case!). Am I the only one detecting a lack of guidance and inspiration? Most likely, there is more to it than what is publicized.
We respect and admire Google (how can you not?) for its simplicity and performance. However, this is 2008, and bear the-semantic-search is not only dancing the tango today, it will take up ice dancing pretty soon. When it debuts with full force, anti-gravity machines will start to lift old expectations along with some people’s hats (if they are not glued to their heads).
What puzzles me is that many established search technologists persistently avoid the question “what is beyond statistics?” What do you do when you have a good-quality Web document that is not statistically sampled? This question will get more serious when we consider dynamic Web content (something increasing every day), where statistical sampling cannot mature before the content is outdated. In short, the long tail. We have not heard an answer so far.
Let me put it another way. Do you choose your doctor, lawyer, spouse, religion, financial advisor, retirement plan, political hero, or baseball team by statistics? Or do you have your personal views? For the latter, you need to go beyond the popular view. That is where semantics starts.
For those who want to see glimpses of bear the-semantic-search doing tango moves, I have listed a few queries below in a side-by-side comparison of hakia with Google, Yahoo, and MSN. Sorry that you have to sign up to access these pages and to continue with your own experiments.
What proteins are highly expressed in the lungs?
Who is the best plumber in new york city?
What are the elemental forms of carbon?
What drug treats urinary tract infection?
These queries only scratch the surface of the long tail. Even with short and popular queries, you can see some nice moves with semantic search:
asthma
toyota
jamaica
These examples are here to show the signs of what is coming, and they should naturally raise the question of why Google cannot handle them: why it does not rank the most relevant and transparent result at the top, or offer the most intuitive categorization of simple search terms. As I said, what you are seeing is only the beginning of an ultimately different search experience.
We keep an archive of search comparisons so that we do not fool ourselves when Google’s bad results are mysteriously corrected following public examples of this sort. It happened in the past, probably by accident! If it happens again, we will display the corrections here.
The question of semantic search is not an IF question, it is a WHEN question.
If the alarm bells are not ringing for “search” itself, in the hope of a “longer WHEN”, they should be ringing for advertising. Search and on-line advertising are twin sisters, and the latter does not take jokes well.