Humans communicate ideas and concepts through words. As I write this document, I have ideas I am trying to express, and I attempt to choose the words that best communicate what I want to convey to the intended reader. There are many words we can choose from when expressing our ideas. Multiple words map to similar concepts, so human understanding depends heavily on “conceptual similarity”, or synonymy. When someone says “father”, we know internally that it is very similar to “daddy” or “papa”. Similarity allows communication to break from “keyword” mode and focus on the intent behind the words being used. A stronger understanding of similarity allows for more accurate and fruitful communication of ideas. Conceptual similarity not only allows human communication to continue, it also provides the variation that adds style, dialect and differences in connotation.

The current industry attempts at similarity have been lacking. The industry has tried to account for similarity by utilizing basic thesaurus functionality. With a database or table, similar words have been hard-coded, or tied, to their near matches. This “hack” has provided for some variation, but in most cases the results haven’t been more accurate; they have been more muddled. The dreaded “dump truck of information” strikes again, and the end result has been to push similarity and synonymy to the background for later use. Without a strong understanding of similarity, Natural Language Processing will never get to where it promises it is going.

True understanding of similarity requires an understanding of the core concepts that make up any given concept. “To disapprove” means to “consider something bad”. “Consider” means “to have an opinion”. Immediately, as humans, we know that when someone considers something bad, they are disapproving of that something. Furthermore, “to disapprove of something greatly” starts to push towards “despise” or “hate”. In a conversation, someone may mask their hate by using the softer words “disapprove” or “consider”, but the other words they use with the concept will give away their intent. Coming out and saying “I despise x” conveys the same concept as beating around the bush with “I really consider x to be bad”. What the softer phrasing adds is information on a different meta-level. The speaker is either unaware of their “hate” or feels it is unacceptable to express, so they try to soften their opinion with concepts that do not carry as strong an emotionally negative connotation.
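
To make the decomposition idea a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the core-concept labels (`have_opinion`, `bad`, `good`), the 0 to 1 intensity scale and the scoring formula are all invented, not an established method. It compares two words by how many core concepts they share, then discounts the score by the gap in intensity.

```python
# Hypothetical decompositions: each word maps to a set of core concepts
# plus an intensity on an invented 0-1 scale (0 = mild, 1 = forceful).
CONCEPTS = {
    "disapprove": ({"have_opinion", "bad"}, 0.4),
    "despise": ({"have_opinion", "bad"}, 0.9),
    "hate": ({"have_opinion", "bad"}, 1.0),
    "approve": ({"have_opinion", "good"}, 0.4),
}


def similarity(a, b):
    """Score two words by shared core concepts, discounted by intensity gap."""
    cores_a, strength_a = CONCEPTS[a]
    cores_b, strength_b = CONCEPTS[b]
    # Jaccard overlap of the two core-concept sets.
    overlap = len(cores_a & cores_b) / len(cores_a | cores_b)
    # Words that differ greatly in intensity are treated as less similar.
    return overlap * (1.0 - abs(strength_a - strength_b))


for pair in [("despise", "hate"), ("disapprove", "hate"), ("disapprove", "approve")]:
    print(pair, round(similarity(*pair), 2))
```

With these made-up numbers, “despise” and “hate” come out as near matches, “disapprove” and “hate” share the same core concepts but sit further apart in strength, and “disapprove” and “approve” score lowest because they only share the idea of having an opinion.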

This “hypothetical” concept thesaurus request would be scored and parametrized. A request for a formal thesaurus word for “daddy” would yield “father”. As we saw earlier, a request for a more forceful word for “disapprove” would yield “hate”. This is accomplished dynamically with an understanding of how to compare the two concepts, not by a table or database hack. No keyword search would hold a candle to such functionality, and an AGI with this understanding would be one step closer to truly passing the Turing Test.
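
As a rough sketch of what a scored, parametrized request might look like, consider the Python below. The `Entry` fields, the `request` function and every number in `LEXICON` are hypothetical. The values are hand-written here only so the example runs; in the system described above they would be derived from an understanding of the concepts rather than stored in a table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    word: str
    concept: str      # hypothetical core-concept label
    formality: float  # 0 = casual, 1 = formal (invented scale)
    intensity: float  # 0 = mild, 1 = forceful (invented scale)


# Hand-assigned values, for illustration only.
LEXICON = [
    Entry("daddy", "male_parent", 0.1, 0.5),
    Entry("papa", "male_parent", 0.2, 0.5),
    Entry("father", "male_parent", 0.9, 0.5),
    Entry("disapprove", "consider_bad", 0.7, 0.4),
    Entry("despise", "consider_bad", 0.6, 0.9),
    Entry("hate", "consider_bad", 0.4, 1.0),
]


def request(word, formality=None, intensity=None):
    """Return same-concept words scored against the requested parameters, best first."""
    source = next(e for e in LEXICON if e.word == word)
    candidates = [e for e in LEXICON if e.concept == source.concept and e.word != word]

    def score(e):
        # Start from a perfect score and subtract the distance on each
        # parameter the caller actually asked about.
        s = 1.0
        if formality is not None:
            s -= abs(e.formality - formality)
        if intensity is not None:
            s -= abs(e.intensity - intensity)
        return s

    ranked = sorted(candidates, key=score, reverse=True)
    return [(e.word, round(score(e), 2)) for e in ranked]


print(request("daddy", formality=1.0))
print(request("disapprove", intensity=1.0))
```

Under these assumed values, the formal request for “daddy” ranks “father” first and the forceful request for “disapprove” ranks “hate” first, each with a score attached so the caller can judge how close the match is.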
