If Google had Semantic Technology…

July 18th, 2008 by Dr. Riza C Berkan, CEO

I am writing this blog article in response to the mention of hakia in the NYTimes blog article written by Saul Hansell, who praises Google’s technology in the wake of Google’s declining profits and a 12% drop in its share price.

Mr. Saul Hansell and I had an exciting conversation about the future of Web search. In comparison to Google, our point was, and always is, that semantic search is not an option; rather, it is an irreversible technological reform that is already taking place in multiple dimensions. Mr. Hansell reassured us that Google already has these capabilities in various forms.

If Google had semantic technology, how can anyone explain the inadequacies encountered with the following queries (in comparison to hakia)?

Does insurance cover brain surgery?

Do DOE and EPA have conflicts?

longest strike in the history of railroads

What are the benefits of shifting lanes?

If Google had semantic technology, why don’t we see categorical results showing all the aspects of a short query?

piano, peru, the meaning of life, Mark Rylance, Biodefense and Bioterrorism, Albinism, America’s Got Talent

If Google had semantic technology, it would be able to rank credible sources at the top for important queries like this:

what treats headache

Google results, from top to bottom, come from 1- NYTimes (Art Section), 2- NYTimes (Art Section), 3- Revolutionhealth.com (commercial outfit), 4- medscape.com (commercial outfit)… hakia results, in the same order, come from 1- Philadelphia Chronicle (news), 2- Kidshealth.org (recommended by Medical Libraries Association), 3- Mayoclinic.com (recommended by Medical Libraries Association), 4- who.int (World Health Organization), 5- Wikipedia,…

If Google had semantic technology, it would not bring a result like:

The “Weinple bill, authorizing- the appointment of street car’ employes to be special policemen during strikes was killed. In the Senate there was debate on..

to the query:

what bill was killed in the senate

We have no idea how Google’s algorithm works, and it does a great job in so many ways. But, one thing is clear. The results show no sign of systematic performance to understand the meaning of concepts. They don’t show ranking based on quality. They don’t show aspect categorization beyond statistical clustering. They don’t show question type detection.

My conversation with Mr. Hansell reflected our experience with Google, a simplified version of which I have outlined above.

The small differences in search ability shown here may naturally grow into larger differences in the future as hakia’s semantic technology advances step by step. What this means for the search business remains to be seen. But one thing is for sure: you cannot do a patch job to create semantic technology out of a system that indexes keywords and augments them with statistics. Semantic technology has to be built from scratch with the first principles.

26 Responses to “If Google had Semantic Technology…”

  1. Shiva Says:

    I have been reading the way this argument has been panning out. The point is that Google has been able to extract semantic information from its statistical data and use it to present information. Although hakia technology might be based on first principles or be more cool, it remains to be seen whether it will be as scalable, practical and efficient on the entire web, especially when used by witless hordes on the web.

  2. Dr. Riza C Berkan, CEO Says:

    Semantic information cannot be extracted from statistical data. Let’s say you have a huge list of all the queries asked. All you can do is get the frequency of occurrence of the words or phrases. This is still statistics. This is not semantics. Let’s say you have a list of queries versus links clicked, and you can cluster which query must show which links. Again, this is still statistics. You cannot make a statistics engine create meaning. All you can do is extract what is popular. Let’s not underestimate how our brain works. If you read a text that you have not seen before in your life, are you going to fail to understand the text? If I say “Romans did not wear tuxedos because they did not have proms” do you understand the meaning of this sentence, or do you need statistical sampling first? I am almost sure you have not seen these words together (and the sentence) in your life :-)
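
As a rough illustration of the kind of statistic described in the comment above, here is a minimal Python sketch that counts word frequencies across a query log. The log contents and the names `query_log` and `term_frequencies` are hypothetical and chosen only for this example; this is not a description of how Google or hakia actually process queries.

```python
from collections import Counter

# Hypothetical query log; these strings are illustrative, not real data.
query_log = [
    "what treats headache",
    "headache remedies",
    "what bill was killed in the senate",
    "headache causes",
]


def term_frequencies(queries):
    """Count how often each lowercase word appears across all queries."""
    counts = Counter()
    for query in queries:
        counts.update(query.lower().split())
    return counts


frequencies = term_frequencies(query_log)
# The two most frequent words in the log, with their counts.
top_terms = frequencies.most_common(2)
```

The output only says which words are popular in the log. Nothing in the counts tells the program what "headache" means, which is the distinction the comment draws.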

  3. Blackhatseo Says:

    I think you should post more often, I have enjoyed this so far. Added to my reader. SusanO

  4. Import from China Says:

    I came across this blog the other day and you got some great info here – thanks.

  5. CJ Says:

    Do you natural language queries? Semantics are used in almost every single natural language processing system for IR, machine translation, natural language understanding….

    “But, one thing is clear. The results show no sign of systematic performance to understand the meaning of concepts. They don’t show ranking based on quality. They don’t show aspect categorization beyond statistical clustering. They don’t show question type detection.”

    Well, because Google isn’t dealing with natural language queries at this time, but rather keywords, I think you would have more luck with a keyword search. Of course the results are ranked using statistical methods, but not only those. I agree that the results aren’t always amazing, and sometimes the ranking isn’t quite what I’m personally looking for, but it’s a little unfair to evaluate based on natural language queries when it’s not a Q&A system as such, it’s an IR system.

  6. Anonymous Says:

    Dear CJ

    You sound like there is a magic wand to declare what is a keyword or not. The examples I showed above, like piano or Albinism, can be called “keywords.” Same with “what treats headache”. Drop the word “what” and you have two keywords standing next to each other. Neither Google nor hakia will assume the query to be “keywords” or a natural language query. They are, in any intellectual imagination, the same thing. They are queries created by the human brain. What is different is the processing of the query: Google’s algorithm versus hakia’s. The former goes for statistical compliance, the latter goes for semantic compliance. Comparisons are not apples and oranges from the QUERY perspective.

  7. T Benson Says:

    “Semantic technology has to be built from scratch with the first principles.”

    Well said, Dr. Berkan! However, what we at Cognition Technologies don’t see anyone talking about is what we believe is the key to making Natural Language Processing successful – the Semantic Map.

    To advance NLP beyond its current limitations is a very difficult problem because you have to unravel the complexity of language. In particular, individual words have multiple meanings, and at the same time, a given concept or meaning can be referred to by multiple words. To understand which meaning of a word is correct requires an understanding of context, but fully understanding context requires that words and meanings be analyzed or “curated” one at a time into a vast Semantic Map.

    Cognition’s Semantic Map of the English language, which has been built over the past 23 years, is complete and robust.

    Utilizing a Semantic Map differs from Google’s algorithms in that Google relies primarily on string pattern-matching, and does not try to interpret the meaning of the words within context. With a Semantic Map, it is possible to determine word meanings and therefore understand meanings rather than simple string pattern recognition. For example, in Wikipedia, in response to a query “strike oil in California”, Google returns documents about “strike commander” (a flight simulator), striking workers, hunger strikes, etc., in their top 5 results. With a Semantic Map, the most precise and complete retrievals about discovering oil in California are returned, because the meaning of “strike” in context is interpreted as “discover”, as opposed to “attack militarily” or “walk out”.

    Great discussions going on here. Thanks for taking the time to respond to Mr. Hansell.
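
The idea of choosing a word's meaning from its context, as in the “strike oil in California” example above, can be sketched in a few lines of Python. This is a toy under stated assumptions, not Cognition's Semantic Map or any real system: the sense inventory `SENSES`, its cue words, and the function `pick_sense` are all hypothetical and hand-written for illustration.

```python
# Hypothetical, hand-written sense inventory for the word "strike".
# A real semantic map would be far larger and is not described here.
SENSES = {
    "discover": {"oil", "gold", "gas", "well", "drilling"},
    "walk out": {"workers", "union", "railroad", "railroads", "labor"},
    "attack militarily": {"air", "missile", "target", "commander", "military"},
}


def pick_sense(query, senses=SENSES):
    """Return the sense whose cue words overlap most with the query, or None."""
    words = set(query.lower().split())
    best_sense, best_overlap = None, 0
    for sense, cues in senses.items():
        overlap = len(words & cues)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


oil_sense = pick_sense("strike oil in California")
rail_sense = pick_sense("longest strike in the history of railroads")
bare_sense = pick_sense("strike")
```

With this inventory, the oil query selects "discover" and the railroad query selects "walk out", while the bare word gives no sense because it has no context. The hard part the comment points to is building and curating the inventory itself, which this sketch simply assumes.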

  8. Lindsay Hogan Says:

    Well Mike, great post! I agree with Dr. Riza that semantic information cannot be extracted from statistical data. Thanks again, Dr. Riza. I am stumbling it here http://lindsayhogan.stumbleupon.com/ so that my whole seo group can share this post with me.

  9. seo blog Says:

    That was a well written article, very interesting, thank you for a good read.

  10. Kai Mai Says:

    I tried to search “apple”(http://club.hakia.com/challenge/default2.aspx?q=apple).
    But Hakia doesn’t show results categorized by computer or food.
    I was expecting Hakia to ask me to narrow down what I actually mean.
