Archive for March, 2008

Semantic Search and Semantic Web, Two Different Things

March 29th, 2008 by Dr. Riza C Berkan, CEO

Hello again.

Some bloggers are still mixing things up and putting hakia (and several other startups for that matter) in the wrong buckets. Maybe I should use images to explain, since writing about it seems to be inadequate in some cases.

Let’s start with the people. The red ones below are linguists by training, greens are tech savvy Web authors, and yellows are Web authors or individual publishers who are afraid of the mouse. The distribution of them on the Web is a big unknown, but for the sake of the ongoing argument, we will grossly approximate it as shown below:


The idea of Semantic Web is that the people (red, green, and yellow) will start publishing Web pages following the rules of semantics and linguistics. They will follow one standard to do that, probably using tools provided by the companies in this line of business. As a result, the Web will become Semantic Web by the people from every corner of the globe, offering unique benefits.


There are some questions that must be asked considering this picture. (1) While only a fraction of the Web pages written today follow the basic W3C standards, how is it possible that people will follow one standard for semantics? (2) How feasible is it to expect this crowd to do the job adequately, considering the challenging nature of semantics and linguistics?

Now, the idea of Semantic Search technology is different as shown below. Let’s take hakia, for example, which is located in New York City.


A team of linguists and ontologists at hakia has built the required semantic algorithms and resources. This means that the rest of the world does not have to worry about following any standards or learning the complex nature of this technology. They can just publish freely, without rules or standards. The questions to be asked of companies like hakia are (1) how good are the semantic resources built? (2) how versatile is the technology to conduct search and other functions on the Web?

Whatever the questions are, it looks like we are facing the first challenge of identifying the technologies correctly. Let’s also remember that sheer Web connectivity is not semantics as I explained in my earlier blog entry.

But if all this is still confusing, I just want to close with a final remark: hakia is not a Semantic Web company.

hakia at the SES Show

March 25th, 2008 by Melek Pulatkonak, COO

We would like to thank all who stopped by our booth at the SES show last week. The interest in the hakia Club, the BETA search engine and our recently announced technology licensing product was overwhelming.

If you missed us at the SES show, you can now:

- join the hakia Club online,

- email bdev@hakia.com to learn more about OntoSem licensing partnership opportunities, and,

- request a one-on-one demo with Farrah, our Communications Coordinator. You can meet and contact her in this room: What would you like to see hakia do?

A special thanks goes to all who blogged about us at the SES. Thank you!

About hakia’s Market Positioning

March 19th, 2008 by Dr. Riza C Berkan, CEO

Yesterday, there were several blog postings mentioning our technology licensing initiative, such as the one on the VentureBeat blog. We thank everyone for carrying our message quite accurately.

However, we need to make one small correction to misunderstandings that appear here and there: Are we a Google killer? Are we competing with Google-esque search engines? Are we a specific application search engine?

hakia is a general purpose “semantic” search engine whereas Google-esque engines are general purpose “statistical” search engines. As a semantic search engine, hakia is being developed to fulfill the different needs of a different type of on-line searcher. These differences refer to potential benefits (not yet fully realized) within the boundaries of “general purpose” utility. If it sounds confusing, I wouldn’t blame you.

Let me give an example. If the user enters the query “benefits of aspirin”, Google-esque search engines will rank results by popular opinion (via link referrals). Popular opinions are formed by millions of ordinary people rather than by a small group of the designers of the drug at Bayer. Therefore, there is no alternative view available on the Web today, alternative meaning a different criterion or a different perspective: perhaps the perspective of credibility, freshness, applicability, feasibility, depth, and so forth.

You might be about to take aspirin and wondering about its benefits, or you might have a weak heart, or you might be a genetics researcher. Depending on who you are, the perspective for ranking search results can vary. Using Google-esque search engines, we always see one fixed perspective. hakia is about to enrich this experience.

For the reasons I am trying to explain, hakia’s competitive position is undefined, and hakia’s promise is not built on competing for the same turf with others. Note that other semantic search start-ups are saying similar things, so there is an independently formed consensus about it: Semantic technologies will bring out something new about the Web that is hard to place on any competitive scale.

If we have to compare ourselves to others, we can proudly say that we have taken the long road of building a sound core technology (still continuing). Our latest technology licensing initiative is the natural outcome of this process. You might ask what kind of core-technology initiatives are coming out of the others. Or does it really matter?

Beyond technology licensing, we will offer semantic technology by various channels ranging from syndication services to APIs. We believe semantic technologies will float underneath every Web interaction, and that’s the focus which defines our current state of mind in the best way.

hakia Starts Technology Licensing, First Partner Riverglass Inc.

March 18th, 2008 by hakia Team

We are proud to announce that the development of our core technology, Ontological Semantics (OntoSem), has now reached maturity and is ready to be licensed to partners who are developing semantic applications. Our first partner is RiverGlass, Inc. (www.riverglassinc.com), a leading provider of advanced real-time analytics and intelligent Web information collection and analysis solutions. RiverGlass, Inc. will integrate hakia’s OntoSem in its analysis software.

OntoSem technology enables computer algorithms to analyze text and to understand the embedded meaning by producing an event-based TMR (text-meaning representation). Our Chief Scientific Officer Dr. Christian Hempelmann posted a TMR example earlier in our blog. An interactive demo, restricted to a limited portion of our OntoSem system, is also available at the hakia lab.

The TMR can be used to deliver high-level, “semantic” relevancy in applications including:

- Categorization (document management)
- Summarization (document management)
- Retrieval (search & advertising)
- Abstraction (contextual advertising, SEO)
- Classification (alerting, information security)
- Clustering (social networking)
- Machine translation

The hakia.com BETA search engine will soon offer some of these capabilities on-line and in the form of Syndication Services.

More information can be found at the hakia club.

What would you like to see hakia do?

March 17th, 2008 by Farrah Hamid, Communications Coordinator

Hello everyone! I joined the hakia team a week ago as its Communications Coordinator. It’s truly exciting and energizing to be part of something new and innovative, at its early beginnings.

As a team, we love feedback. I would therefore like to kick-start ongoing feedback sessions by asking our readers this question: What would you like to see hakia do? Meet me in this room to tell us what you think.

Or even better, if you are attending the SES show next week, stop by the hakia booth (# 2200) to say hi and let me know what’s on your mind. You can also check out what we have been up to lately. We look forward to seeing you there.

Connectivity is Not Semantics!

March 14th, 2008 by Dr. Riza C Berkan, CEO

The article on Times Online of March 12, 2008 starts with the definition: “The semantic web is the term used by the computer and internet industry to describe the next phase of the web’s development, and essentially involves building web-based connectivity into any piece of data – not just a web page – so that it can “communicate” with other information.”

Sounds good, but I find it grossly incomplete.

The term “semantic” refers to a cognitive process of understanding. It is a structure that starts from data, reaches the conceptual representations of that data, and then retrieves related attributes from those representations to arrive at a decision. Semantics is not just having connections to various forms of data and establishing lateral communication among them.

The article continues: “Mr. Berners-Lee said... Imagine if two completely separate things – your bank statements and your calendar – spoke the same language and could share information with one another. You could drag one on top of the other and a whole bunch of dots would appear showing you when you spent your money.” What Mr. Berners-Lee is saying can be interpreted in two ways, one of which is the simplistic view:

sem1.gif

There is nothing semantic in this example. Data-level communication between various forms of documents can be accomplished with existing tags or XML structures. You can go to Amazon.com and see connections between your purchase of a CD and a related book, which is not much different from the example above.
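The data-level linkage described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bank-statement and calendar records, their field names, and the function `link_by_date` are all hypothetical and do not come from any real banking or calendar format.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: both sources happen to expose a plain date field.
bank_statement = [
    {"date": date(2008, 3, 12), "payee": "Airline", "amount": 420.00},
    {"date": date(2008, 3, 13), "payee": "Hotel", "amount": 180.50},
]
calendar = [
    {"date": date(2008, 3, 12), "title": "Fly to conference"},
    {"date": date(2008, 3, 14), "title": "Dentist"},
]


def link_by_date(transactions, events):
    """Pair each calendar event with the transactions made on the same day."""
    by_day = defaultdict(list)
    for tx in transactions:
        by_day[tx["date"]].append(tx)
    # .get() with a default avoids creating empty entries for unmatched days.
    return {ev["title"]: by_day.get(ev["date"], []) for ev in events}


links = link_by_date(bank_statement, calendar)
for title, txs in links.items():
    print(title, [tx["payee"] for tx in txs])
```

The join works purely by matching field values. Nothing in it knows what a flight or a payment is, which is why this kind of connectivity alone is not semantics.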

When the term semantic is used, the heart of the problem is not the connectivity, it is the structure. The same example above should be considered in the following manner.

sem3.gif

The important part is the concept map and lexicon by which a computer algorithm will make decisions for compliance. It is the process of diving into the concept space, and floating back to the surface with the correct sense of the data in hand. Only then will your Web connectivity become Semantic Web connectivity. Mr. Berners-Lee’s example must be interpreted this way.
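To make the role of a lexicon and a concept map more concrete, here is a deliberately tiny Python sketch. It is not hakia’s OntoSem and makes no attempt at word-sense disambiguation; the lexicon entries, the concept names, and the functions `concepts_of` and `related` are all hypothetical, invented only to show the dive from words into a concept space and back.

```python
# Hypothetical toy lexicon: each word maps to one or more senses (concepts).
LEXICON = {
    "flight": ["AIR-TRAVEL"],
    "fly": ["AIR-TRAVEL"],
    "airline": ["AIR-TRAVEL"],
    "hotel": ["LODGING"],
    "stay": ["LODGING"],
    "dentist": ["MEDICAL-CARE"],
}

# Hypothetical toy concept map: each concept points to its parent concept.
CONCEPT_MAP = {
    "AIR-TRAVEL": "TRAVEL",
    "LODGING": "TRAVEL",
    "MEDICAL-CARE": "HEALTH",
}


def concepts_of(text):
    """Return the concepts evoked by a text, including parent concepts."""
    found = set()
    for word in text.lower().split():
        for concept in LEXICON.get(word, []):
            found.add(concept)
            parent = CONCEPT_MAP.get(concept)
            if parent:
                found.add(parent)
    return found


def related(text_a, text_b):
    """Treat two texts as related if they share at least one concept."""
    return bool(concepts_of(text_a) & concepts_of(text_b))


print(related("Fly to conference", "Airline ticket"))
print(related("Hotel stay", "Fly to conference"))
print(related("Dentist", "Airline"))
```

Here a calendar entry and a bank transaction are linked because they evoke the same concept, not because they share a date or a tag. A real system would need a far larger lexicon and a way to pick the right sense of each word in context.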

Another recent article by TechCrunch touches on the same topic. While this article acknowledges the semantic process better, it does not question the feasibility of having user communities comply with the difficult rules of semantic structures.

At hakia, we have built concept maps and an English lexicon (over 100,000 word senses) with a dedicated team of well-trained linguists and ontologists over 3 years (and still continuing). You can read about the science of Ontological Semantics and what it takes to do the job. It simply amuses us to find out about optimistic strategies that rely on the end-user to deploy semantic rules or to follow semantic standards. The semantic Web will not emerge out of the end-user or programmer communities; it has to be done by professionals, and delivered to them.

With the increasing emphasis on semantic search in the market, we are entering the phase of short-changed interpretations and incomplete definitions. Our insistence on correcting these views may seem unnecessary to some people. However, the past is full of examples of abusing terms like natural language processing and artificial intelligence. We don’t want “semantic search” to become another victim of misrepresentation.

What does hakia want to become?

March 13th, 2008 by Melek Pulatkonak, COO


This question pops up in every Q&A session when we present to a business audience. Yesterday was no different. I presented hakia.com at the 2008 Montgomery Technology Conference and someone asked: “What does hakia want to become: an OEM partner or a destination site?”


The answer is both. Our goal is to become a destination site, a search engine that offers quality search. We will also power publishers’ sites with our syndication service. Currently, we are working on pilot projects. Ping us at bdevathakiadotcom if you are interested in running a pilot with us.

The serious discussions on the state of the private equity markets came to a screeching halt on the first evening of the conference when the Beach Boys performed for the attendees. Yes, it was the real Beach Boys. We always contemplate the future. It was a pleasant change to pause and take a walk down memory lane by listening to timeless classics like Surfin’ USA.

10 New Search Engines to Watch

March 12th, 2008 by Kartal Guner, Chief Architect

For those who are interested in companies with new search technologies, we have put up a monitoring page on hakia club that shows their US traffic and rankings. Our selection includes 10 search engines with new algorithms (not human powered) and/or a new user interface. Some companies have been around longer than others, and some are in stealth mode. The page looks like this:


searchengines.gif

The charts come from Compete.com. Note that Compete.com stats can be misleading for low traffic measurements due to sampling errors, and some occasional odd behavior can be observed, probably due to Compete.com’s own data updates. We have also included Technorati charts to show the number of blog mentions about these companies.

These companies may not necessarily be in direct competition until their final market positioning becomes obvious.

Join hakia club to view this page and its updates, and for more back-stage information about hakia.

hakia ends Partnership with Ask.com

March 8th, 2008 by hakia Team

Just to set the record straight, hakia is no longer partnering with Ask.com for on-line advertisements and syndication services. The new partner will be announced soon.

This move follows our plan to switch to a more flexible arrangement, as hakia’s own semantic advertising system is expected to debut this year. We had a productive one-year relationship with Ask.com, and we thank them for their support and service.

Stay tuned for the upcoming exciting developments at hakia.

hakia Participating in LangTech 2008

March 7th, 2008 by Dr. Christian Hempelmann, Chief Scientific Officer

I recently returned from Rome, where I represented hakia at the LangTech2008 conference. Hans Uszkoreit of the DFKI had invited several search scientists to join a panel on “Human Language Technology in tomorrow’s search application.” The night before the panel, conference participants could take a tour of the impressive Castel Sant’Angelo, where we enjoyed the view over the city and a great buffet dinner, during which I had a lively conversation with Prof. Uszkoreit and Dr. Lenz.

For the panel the next day, Hans Uszkoreit started us off with a set of questions on the promises and limitations of NLP in search. Next, Geoff Zweig of Microsoft focused on their research on voice search with summarization improvement; Mario Lenz, CTO of empolis (part of arvato, part of Bertelsmann), the leader and coordinator of the EU Theseus project, gave a general overview of NLP support and its limitations; Tom Hofmann, Director of Engineering of Google’s research team in Zurich, presented recent advances in NLP support for their search engine; finally, I talked about how all that was yesterday and how real semantics will mean relevance and how statistics will continue to mean staying below the threshold. Most questions during the following half hour of audience questions and panelist discussion went to Tom Hofmann and me: to him, how they do what they do; to me, whether what we claim we’ll do is really possible.

I’m safely back in our New York office now, with the usual travel-induced cold. But I’m truly glad to have had the opportunity to meet and talk with these colleagues and other participants of the conference and present our approach to the audience at LangTech 2008. I was particularly happy to have also found other researchers who think that statistical and formal methods alone will continue to not deliver acceptable applications in NLP. I discussed this in detail over lunch with Claude Roux of Xerox, Grenoble. Very telling was a remark by a member of the audience after a paper on the semantic web by Christopher Welty of IBM. The remark cited Hubert Dreyfus’s criticism of the claims of progress in standard statistical artificial intelligence: “…the first man to climb a tree could claim tangible progress toward reaching the moon. Rather than climbing blindly, it’s better to look where one is going.” (Dreyfus, Hubert L. 1992. What Computers Still Can’t Do: A Critique of Artificial Reason. Cambridge, MA: MIT Press: page 100.). We know where we want to go with search and we know we need nothing less than real semantics, Ontological Semantics, to get there. Others are still climbing the statistical trees.