For my 16-year-old son Alex, who asked about bots that can think.
This is based on the scientific publication "Symbol Emergence in Robotics: A Survey". That paper deals with the ability of a computer to think,
act, and communicate like a human. We shall call such a computer a bot going forward.
Since a bot that I can't speak to is a dumb bot, this bot
has to have some understanding of language.
The authors propose that the bot has some
sort of internal "symbol" recognition system.
What's a symbol, you ask? A symbol is a word that represents
something (a single thing) in the real world. E.g.: dog, cat, people, Sherry,
jump, run, etc.
We humans are very good at stringing these symbols together
to mean something larger. This is called emergence. Simply put, emergence is
when simple rules, when followed, create an idea/action/object that is larger than
the rules themselves. Be careful, as this is VERY different from traditional logical programming.
But we can dive into that at a different time.
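Here's a tiny taste in code. This is my own example, not from the paper: Conway's Game of Life, where a couple of simple rules about neighboring cells produce a "glider" that travels across the grid, even though no rule says anything about movement.

```python
from collections import Counter

# Conway's Game of Life: simple local rules, yet patterns "emerge"
# that move and interact -- nothing in the rules says "move".
def step(live):
    """One generation. `live` is a set of (x, y) cells that are alive."""
    # Count how many live neighbors every nearby cell has.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell lives next turn if it has 3 live neighbors, or has 2
    # live neighbors and is already alive.
    return {c for c, n in counts.items() if n == 3 or (n == 2 and c in live)}

# A "glider": watch its coordinates drift diagonally, generation by generation.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
for gen in range(5):
    print(f"generation {gen}: {sorted(glider)}")
    glider = step(glider)
```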
So, if humans are good at stringing symbols together, our
intelligent bot had better be too. However, there are three almost universally
accepted challenges with this system:
- Some symbols don't connect to anything in the physical world. This is called the grounded symbol problem.
- A symbol means something, but that meaning can't be explained by a collection of simpler symbols. This is called the dynamic symbol problem.
- A symbol itself and the way people "interpret" the symbol are different, usually due to some social aspect. This is called the social symbol problem.
Looking at it a little deeper...
Grounded Symbol Problem
"Feeling", that is a word or a symbol. But what is
it's connection to the real world. There is nothing you or I can
touch/see/taste/hear that defines the word "feeling". It is an
internal symbol that is exclusive to the human brain.
Now, this leads down a very slippery path: what is the
internal representation of the word "feeling" in our brain? Unfortunately, that
is too large a subject for me to cover. (Also, I don't think I could put it in
layman's terms.)
However, just note that some words and symbols have this
issue, and they are extremely difficult to fit into our intelligent bot's
brain. (More reading: https://en.wikipedia.org/wiki/Symbol_grounding_problem)
Dynamic Symbol Problem
Let's do a simple walkthrough of the word "dog".
Its symbols might be "animal, mammal, canine". We step deeper:
what's a "mammal"? An "organism". What's an
"organism"? "Cells". What is a cell?
See how we just kept drilling down, but we never actually
arrived at a meaning for the word "dog"?
This is the issue with dynamic symbols.
FYI: If you google Dynamic Symbol Problem you won't find
much, as I made it up. It's a subset of the Grounded Symbol Problem.
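Here is a toy sketch of that endless drilling down. The dictionary is made up by me; every definition just points at more symbols.

```python
# A toy dictionary where every symbol is defined only by other symbols.
# Expanding "dog" never bottoms out in anything real; it's symbols all
# the way down.
definitions = {
    "dog": ["animal", "mammal", "canine"],
    "mammal": ["organism"],
    "organism": ["cells"],
    "cells": ["organism"],  # ...and around we go
}

def expand(symbol, depth=0, seen=None):
    """Drill down through the definitions, printing each level."""
    if seen is None:
        seen = set()
    print("  " * depth + symbol)
    if symbol in seen:
        print("  " * (depth + 1) + "(a loop -- we never reached the real world)")
        return
    seen.add(symbol)
    for part in definitions.get(symbol, []):
        expand(part, depth + 1, seen)

expand("dog")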
Social Symbol Problem
"Screaming"
The girl was screaming in terror.
The car was screaming down the track.
See how the word screaming means two different things
depending on the social context?
This is difficult to capture in a symbol collection
system.
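A flat lookup table shows the problem right away (the wording of the meanings is mine):

```python
# One sign, two meanings. A plain symbol table has no way to choose
# between them without understanding the social context.
meanings = {
    "screaming": [
        "crying out in fear or pain",   # the girl was screaming
        "moving very fast and loudly",  # the car was screaming
    ],
}

def look_up(sign):
    # Which meaning? The table alone can't say; we would need the
    # surrounding sentence, the speaker, the situation...
    return meanings[sign]

print(look_up("screaming"))  # both meanings come back -- no way to pick
```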
Ok, so now that you know what the issues are, how do we proceed to
solve them?
First, it's been proposed that we separate our definition of
symbols into two families.
c-symbols are symbols that are related to computer science
(think: things with a very exact definition).
m-symbols are symbols that have meanings (think: feelings,
senses, and ideas).
Now, forget about c-symbols; I think we programmers have
those covered.
m-symbols are the hard part. Let's say we look at these
symbols and break them into 3 parts:
The Sign - This is the word: the actual physical characters
of the symbol. The symbol "dog" is made up of 3 characters: D, O, and G. That is
its sign.
The Object - All symbols have to refer to something. Dog
refers to a four-legged animal; at a basic level we connect this symbol with
our mental image of a dog.
The Interpretant - Now this is where things get odd. Ummm, I
have to use some science lingo. Humans have 5 senses
(sight/touch/hearing/smell/taste). ESP might be a sixth, but I digress. When we use
one of our senses, we call that an input to the brain. So humans have 5 inputs.
With these we are aware of our environment. Ok, lecture done.
Back to the interpretant: this is when a remembered input
is reconciled with the sign. In other words, we connect the sign with some
remembered sense we got from our environment. So the three-letter word (symbol)
"dog" triggers us to remember a bark, a growl, petting, a dog walk, etc., and that makes
us "know" what a dog is.
This translation is called the interpretant (in this paper). I
hope I got that right... I'm not sure that I did.
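If I were sketching this in code, a symbol might look something like the structure below. This is my own toy structure, not anything from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class Symbol:
    """One symbol, split into the three parts described above."""
    sign: str                 # the physical characters, e.g. "dog"
    obj: str                  # what the symbol refers to in the world
    interpretants: list = field(default_factory=list)  # remembered sensory inputs

dog = Symbol(
    sign="dog",
    obj="a four-legged animal",
    interpretants=["a bark", "a growl", "petting", "a dog walk"],
)
print(dog.sign, "->", dog.obj, "via", dog.interpretants)
```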
Interesting distraction: this is why reading a story can be
so powerful to us. As we read, these senses are triggered, creating an internal
feeling of what is going on. The more closely we can relate our internal
translations to the words we read, the more powerful the story becomes. I think
this is why we love some books and just can’t seem to get into others. Some
stories just don’t trigger those internal translations.
Back to the matter at hand: we each have a different set of
remembered senses, and this leads each of us to have a different view of what we
hear/read/see. I think the technical term for this is Umwelt.
Reality encapsulates all of our translations of every symbol
we hear or express. That is a deep thought that should be pondered… deeply.
So let’s bring it all together: we have these symbols, which
are made up of signs, objects, and interpretants. There are rules for how
these symbols can be shared with others; this is called a language. This
language emerges from the simple act of combining symbols together. This is
called the Emergent Symbol System.
The How
The first thing the bot must do is collect a set of inputs
that it can identify as a “thing”. It might be as simple as an image of a black
square versus a white square. The inputs of the two differ enough for it to categorize
these two objects as two different ‘things’.
Notice, it has not assigned a symbol to these two things,
rather it has an internal mechanism to say this black square falls into this category and this white square
falls into this other category.
This is a somewhat contrived example, though; the input to a bot is
all digital, so it really is just categorizing its digital input (senses).
We have solved this problem already; there are many algorithms
(programs) that are able to classify input. Search for “pattern recognition” for
examples.
Let’s call this categorization of the inputs “patterns”.
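Here's a minimal sketch of that idea. It uses a crude k-means-style clustering (one of many pattern recognition algorithms), and the tiny "images" are made up by me; the point is that no labels are involved.

```python
# Each "image" is four pixel brightness values (0 = black, 1 = white).
# The bot has no labels; it just groups inputs whose pixels look alike.
black_squares = [[0.0, 0.1, 0.0, 0.05], [0.1, 0.0, 0.02, 0.0]]
white_squares = [[1.0, 0.9, 0.95, 1.0], [0.9, 1.0, 1.0, 0.98]]
inputs = black_squares + white_squares

def brightness(image):
    return sum(image) / len(image)

# A crude one-dimensional k-means with k=2: start with two guesses,
# assign each image to the nearer guess, re-average, repeat.
centers = [brightness(inputs[0]), brightness(inputs[-1])]
for _ in range(10):
    groups = [[], []]
    for img in inputs:
        b = brightness(img)
        nearest = min((0, 1), key=lambda i: abs(b - centers[i]))
        groups[nearest].append(img)
    centers = [
        sum(map(brightness, g)) / len(g) if g else centers[i]
        for i, g in enumerate(groups)
    ]

# No symbols yet! Just "category 0" and "category 1".
for i, g in enumerate(groups):
    print(f"category {i}: {len(g)} images, brightness around {centers[i]:.2f}")
```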
The next step is to connect these patterns with a sign.
Remember a sign is the physical representation of a symbol. “Black Square” =
pattern 1, “White Square” = pattern 2.
There have been a few attempts to create these connections
with written words; LCore by Iwahashi is an example. However, I don’t think that
anyone has successfully conquered this problem.
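Just to make the idea concrete, here is a toy version of the connection step (my invention, not how LCore works): if the bot hears a word every time a pattern shows up, it can count co-occurrences and guess which sign belongs to which category.

```python
from collections import defaultdict

# Each observation pairs a pattern category (from the previous step)
# with a word heard at the same moment. The pairings are invented.
observations = [
    (0, "black square"), (0, "black square"), (0, "square"),
    (1, "white square"), (1, "white square"), (1, "square"),
]

# Count how often each word co-occurs with each pattern.
counts = defaultdict(lambda: defaultdict(int))
for pattern, word in observations:
    counts[pattern][word] += 1

# The sign for a pattern is the word it co-occurs with most often.
signs = {p: max(words, key=words.get) for p, words in counts.items()}
print(signs)  # {0: 'black square', 1: 'white square'}
```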
If we solve this problem, the next issue is how to create a
collection of symbols that carries actual meaning to another person.
The technical term here is “Double Articulation”: meaningless small units
(sounds or letters) combine into meaningful units (words), and those combine
into sentences.
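A toy segmenter shows the two layers. The lexicon here is hand-picked by me; a real bot would have to discover the words without any help.

```python
# Layer one: letters, which mean nothing on their own.
# Layer two: words, which do. This two-layer structure is the
# "double articulation" of language.
lexicon = {"the", "dog", "bites", "man"}

def segment(stream, words=()):
    """Split an unbroken letter stream into known words, backtracking
    when a greedy split leads to a dead end."""
    if not stream:
        return list(words)
    for end in range(len(stream), 0, -1):
        if stream[:end] in lexicon:
            rest = segment(stream[end:], (*words, stream[:end]))
            if rest is not None:
                return rest
    return None  # no valid segmentation

print(segment("thedogbitestheman"))  # ['the', 'dog', 'bites', 'the', 'man']
```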
From a high-level perspective, this sums up the difficulty
of creating a bot that can think/act/communicate as a human. That being said,
there are a few other considerations; I’ll just list them to be brief.
Time – The ability to establish time-sequenced events and how
they relate to our symbols and actions. This has been explored but is still
presenting issues in the field of A.I.
Mutual Belief System – There is a set of ideas that we all
share, typically from a common experience that we all have. Take “this coffee is cold”:
we can easily determine that this means our coffee is not hot enough. But
humans will also perceive that this means we will be getting/needing a different
cup of coffee, because that is what we all do when we drink cold coffee. This
is a shared belief or implied understanding.
Active Learning and Active Perception – As children we
explore. We actively seek new inputs (senses). We do this as children, but our
need to do this lessens as we grow older. Programmatically finding this balance
is challenging.
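One common programmatic version of that balance is epsilon-greedy exploration with decay. It is a standard reinforcement learning trick, and this is my example, not something from the paper:

```python
import random

# Explore (try a random action) with probability epsilon, otherwise
# exploit what we already know works. Decaying epsilon mimics a child
# who explores constantly, then less and less while growing up.
epsilon, decay = 1.0, 0.99
known_value = {"pet the dog": 5.0, "poke the outlet": -10.0, "read": 3.0}

explored = 0
for step in range(300):
    if random.random() < epsilon:
        action = random.choice(list(known_value))       # explore: seek new input
        explored += 1
    else:
        action = max(known_value, key=known_value.get)  # exploit: do what works
    epsilon *= decay                                    # curiosity fades with age

print(f"explored on {explored} of 300 steps; epsilon ended near {epsilon:.3f}")
```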
Syntax – There is a “proper” way to group symbols (words) so
that they make sense. While the bot may understand that the words “dog bites
man” are the correct symbols, it may put together the sentence “man bites dog”:
a totally different meaning. This is syntax, and it is difficult to solve.
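The same three symbols, read in order, in a toy example of mine:

```python
# Two sentences with the exact same symbols but different word order.
# A bag of correct symbols is not enough; the bot must learn that
# order carries meaning.
def interpret(sentence):
    """A toy subject-verb-object reading of a three-word sentence."""
    subject, verb, obj = sentence.split()
    return f"the {subject} does '{verb}' to the {obj}"

print(interpret("dog bites man"))  # the dog does 'bites' to the man
print(interpret("man bites dog"))  # the man does 'bites' to the dog
```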
New Words Learned:
Semiotics - The study of symbols and how they play together.
Symbol Emergence in Robotics (SER) - The official term for
all the above; this is a research topic that seems to be taking off in
Japan.
Epigenetic Robotics - Robot development which models how
children learn.
Umwelt - German for "surrounding world"; a
particular organism's self-centered view of the world.
Emergent System – A complex system whose large-scale behavior arises from simple rules and interactions.
For the Scientific Community
This is my layman's interpretation of the following paper:
Symbol Emergence in Robotics: A Survey
Advanced Robotics Vol. 00, No. 00, January 2015, 1–27
Authors: Tadahiro Taniguchi, Takayuki Nagai, Tomoaki
Nakamura, Naoto Iwahashi, Tetsuya Ogata, and Hideki Asoh