hakia’s Semantic OntoParser

February 17th, 2008 by Dr. Christian Hempelmann, Chief Scientific Officer

Here we explain how hakia’s semantic OntoParser takes a sentence, processes it, and produces a text-meaning representation (TMR). This is the essence of making computers understand natural language by means of ontological semantics resources and parsing algorithms.

Take the simple sentence “The outlaws ran cocaine into the United States.” We (the human brain) can identify the meaning of this sentence easily: Humans who habitually commit illegal acts clandestinely transported a psychoactive drug into a country called the United States. We also infer all kinds of other things from our knowledge of the world: The cocaine probably came from South America, will be sold illegally for profit and consumed by people who will show certain changes in their behavior and emotions (probably pleasant for them, usually unpleasant for those around them) after consuming it, typically by snorting it up their noses, etc.

Let’s see how close to this understanding the computer can get with ontological semantics. First, OntoParser produces all potential senses of the words in the sentence and breaks the sentence up into clauses based on the central events that are identified among those senses. Screenshots from the OntoParser demo are shown below.

onto1a.gif

“The outlaws” has only one sense, CRIMINAL (note that we use capital letters to indicate concepts that express a sense, not words in the sentence). For “run”, however, our system has 9 senses from which it must pick, for example RUN, RUN-FOR-OFFICE, or SMUGGLE. “Cocaine” has two related senses as DRUG, and “United States” likewise has two as COUNTRY. With 1 x 9 x 2 x 2 combinations, this simple sentence has 36 potential meanings at this stage.
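The count of 36 can be reproduced with a minimal sketch that enumerates every combination of word senses. The sense inventory below is an assumption made for illustration: only CRIMINAL, RUN, RUN-FOR-OFFICE, SMUGGLE, and FLOW are named in this post, and all other labels are hypothetical placeholders, not entries from hakia's lexicon.

```python
from itertools import product

# Hypothetical sense inventory. Labels ending in "-SENSE-n" are placeholders
# for senses that the post counts but does not name.
senses = {
    "outlaws": ["CRIMINAL"],
    "ran": ["RUN", "RUN-FOR-OFFICE", "SMUGGLE", "FLOW"]
           + [f"RUN-SENSE-{i}" for i in range(5, 10)],
    "cocaine": ["DRUG-SENSE-1", "DRUG-SENSE-2"],
    "United States": ["COUNTRY-SENSE-1", "COUNTRY-SENSE-2"],
}


def candidate_readings(senses):
    """Return one dict per combination of word senses (Cartesian product)."""
    words = list(senses)
    return [dict(zip(words, combo))
            for combo in product(*(senses[w] for w in words))]


readings = candidate_readings(senses)
print(len(readings))  # 36
```

The number of candidates is the product of the sense counts, so it grows quickly with sentence length, which is why the pruning step described next matters.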

But not all of these combinations are possible: CRIMINALs can’t FLOW a DRUG, for example. Such combinations are excluded by matching properties of the CONCEPTs in our world model, the ontology. FLOW, for example, allows for no agent, only a theme, and that theme must be a liquid. Neither CRIMINAL nor DRUG is a liquid, and only one of them could fit into that EVENT anyway. The parser sets up the 9 possible EVENTs and tries to fit all the other OBJECT senses in the sentence as participants in each EVENT.

The event SMUGGLE allows for a theme that must be a WEAPON, an ILLEGAL-DRUG, or an IMMIGRANT.
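As a rough sketch of how such constraints might be checked, the snippet below encodes a toy is-a hierarchy and two event frames. The FLOW and SMUGGLE theme constraints follow the description in this post; the parent links in the hierarchy and the HUMAN agent constraint on SMUGGLE are assumptions added for illustration, and none of the names reflect hakia's actual ontology format.

```python
# Toy is-a hierarchy (child -> parent). The parent links are assumptions.
IS_A = {
    "CRIMINAL": "HUMAN",
    "IMMIGRANT": "HUMAN",
    "ILLEGAL-DRUG": "DRUG",
    "HUMAN": "OBJECT",
    "DRUG": "OBJECT",
    "LIQUID": "OBJECT",
    "WEAPON": "OBJECT",
}

# Case-role constraints per event. A role missing from a frame is not allowed.
FRAMES = {
    "FLOW": {"theme": ["LIQUID"]},  # no agent slot at all
    "SMUGGLE": {
        "agent": ["HUMAN"],  # assumed, not stated in the post
        "theme": ["WEAPON", "ILLEGAL-DRUG", "IMMIGRANT"],
    },
}


def is_a(concept, ancestor):
    """True if concept equals ancestor or inherits from it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


def can_fill(event, role, concept):
    """True if the event has this role and the concept satisfies its constraint."""
    allowed = FRAMES[event].get(role)
    if allowed is None:
        return False
    return any(is_a(concept, a) for a in allowed)


print(can_fill("FLOW", "agent", "CRIMINAL"))        # False
print(can_fill("FLOW", "theme", "ILLEGAL-DRUG"))    # False
print(can_fill("SMUGGLE", "theme", "ILLEGAL-DRUG")) # True
```

Readings whose participants fail every slot of an event can be dropped before any scoring takes place.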

onto4.gif

The parser fills all EVENTs with the possible PARTICIPANTs (case roles) from the sentence that it has chosen in the previous step. Then it weights the possible EVENTs and all combinations of their PARTICIPANTs, in terms of how well the PARTICIPANTs fit into the EVENTs.

For most EVENTs, CRIMINAL can fill the agent slot, but the other CONCEPTs fit nowhere; for fewer EVENTs, UNITED-STATES can be fit into theme or location, gaining them a higher score. Even fewer EVENTs can accommodate all three other CONCEPTs (some of these readings are actually wrong). But SMUGGLE wins this race because ILLEGAL-DRUG fits closest to the theme it can take. Finally, the parser outputs the text-meaning representations, ordered from the highest scoring to the lowest.
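The ranking step can be illustrated with a small sketch. The post does not spell out the scoring formula, so the one used here, 1 / (1 + number of is-a steps between a participant and the slot constraint), is purely an assumption, as are the toy hierarchy, the role names, and the OPERATE event, which stands in for a sense that accommodates all three concepts but only loosely.

```python
from itertools import permutations

# Toy is-a hierarchy (child -> parent); all links are assumptions.
IS_A = {
    "CRIMINAL": "HUMAN", "HUMAN": "OBJECT",
    "ILLEGAL-DRUG": "DRUG", "DRUG": "OBJECT", "LIQUID": "OBJECT",
    "UNITED-STATES": "COUNTRY", "COUNTRY": "PLACE",
}

# Hypothetical event frames: role -> list of allowed concepts.
FRAMES = {
    "RUN": {"agent": ["HUMAN"], "destination": ["PLACE"]},
    "RUN-FOR-OFFICE": {"agent": ["HUMAN"]},
    "FLOW": {"theme": ["LIQUID"]},
    "SMUGGLE": {"agent": ["HUMAN"],
                "theme": ["WEAPON", "ILLEGAL-DRUG", "IMMIGRANT"],
                "destination": ["PLACE"]},
    "OPERATE": {"agent": ["HUMAN"], "theme": ["OBJECT"], "location": ["PLACE"]},
}


def steps_up(concept, ancestor):
    """Number of is-a steps from concept up to ancestor, or None if unrelated."""
    steps = 0
    while concept is not None:
        if concept == ancestor:
            return steps
        concept = IS_A.get(concept)
        steps += 1
    return None


def fit(concept, allowed):
    """Assumed fit measure: 1.0 for an exact match, lower for looser matches."""
    distances = [d for d in (steps_up(concept, a) for a in allowed) if d is not None]
    return 1.0 / (1 + min(distances)) if distances else 0.0


def best_reading(event, concepts):
    """Best-scoring assignment of concepts to distinct roles of the event."""
    slots = list(FRAMES[event]) + [None] * len(concepts)  # None = unattached
    best = (0.0, {})
    for assignment in permutations(slots, len(concepts)):
        score, filled = 0.0, {}
        for concept, role in zip(concepts, assignment):
            if role is None:
                continue
            f = fit(concept, FRAMES[event][role])
            if f == 0.0:
                break  # constraint violated, discard this assignment
            score += f
            filled[role] = concept
        else:
            if score > best[0]:
                best = (score, filled)
    return best


concepts = ["CRIMINAL", "ILLEGAL-DRUG", "UNITED-STATES"]
ranked = sorted(((event, *best_reading(event, concepts)) for event in FRAMES),
                key=lambda reading: reading[1], reverse=True)
for event, score, filled in ranked:
    print(f"{event:15} {score:.2f} {filled}")
```

Under these assumptions SMUGGLE ranks first because ILLEGAL-DRUG matches its theme constraint exactly, whereas OPERATE accepts the same concept only through the generic OBJECT constraint and therefore scores lower.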

onto3a.gif

This capability is the essence of semantic search where the concepts in a given query are matched to the concepts in Web pages. The range of applications that can use this technology includes summarization, categorization, classification, abstraction, machine translation, data mining, and more.

8 Responses to “hakia’s Semantic OntoParser”

  1. egorych Says:

    Nice explanation. As far as I understand, search relevance for such an engine is determined by a vocabulary and the relations between words. It must work for long phrases, but what about short key phrases of 1-2 words? This especially affects one-word expressions: for example, Google shows “glossary” information for a short term (first place in the SERPs is Wikipedia, etc.), while some other search engines prefer to show shops/markets that carry the requested product. How do you resolve this uncertainty?

  2. Murat Says:

    So my understanding is that using hakia.com in another language is very hard and will take a very long time.

  3. Dr. Riza C Berkan, CEO Says:

    No, Ontology is language independent. Only lexicon translation is necessary. That is one-to-one translation and is readily available.

  4. Murat Says:

    I see, many thanks for replying.

    (I hope you reach the position you deserve in the market, Mr. Rıza; we are proud of you :-) )

  5. Jiri Says:

    To Riza C Berkan: I think that “lexicon translation” is not sufficient, just as it is not sufficient for machine translation in general. Your ontology can be language independent, but semantic extraction or decoding, whatever you call it, is language dependent, and this part is not just a lexicon of words. See, for example, idioms like “run” ~ “run-for-office”; these relations are language dependent. Anyway, I may be wrong. In your case, it should be very easy to port hakia to another language. Are you planning anything like that?

3 Trackbacks/Pingbacks

  1. hakia Blog » Blog Archive » hakia Starts Technology Licensing, First Partner Riverglass Inc. Says:

    [...] Our Chief Scientific Officer Dr. Christian Hempelmann had posted a TMR example earlier in our blog. An interactive demo is also available at the hakia lab which is restricted to [...]

  2. hakia Blog » Blog Archive » Bill Gates Speakes Up For Semantics Says:

    [...] If, finally, you have access to semantics, your constraints on the different senses of ‘kill’, ‘light’, and ‘rock’ will get you to the meaning automatically, and you will serve the sentence above only as an answer to queries about methods to switch off lamps, and not pollute your results with it otherwise. For more examples, you can read my prior blog posts. [...]

  3. info-blog » Co to jest: “semantyczna wyszukiwarka”? Says:

    [...] What does it mean that “Hakia recognizes the meaning” of an expression used in a query and recognizes the same meaning in the indexed documents? Hakia interprets natural-language expressions within fixed models: maps of concepts corresponding to individual terms. Of course, more is available than the simple concepts that constitute the meanings of simple terms: attached to the concepts are various syntactic and semantic conditions that define the possible relations between concepts. It will also be necessary to establish rules for choosing one of the many possible meanings of ambiguous terms. Hakia carries out the semantic analysis of expressions in a way that is, at first glance, quite satisfactory. We simply establish the statistical probability of co-occurrence of the meanings of the words that make up the sentence. The meanings (concepts) are fixed in the dictionary that Hakia uses. Some constellations are quickly eliminated based on an assessment of the possible categorial complements of a given concept, while others are rated as statistically more or less probable. I recommend the discussion of the analysis example on the Hakia blog. [...]
