GHuRU - Reduced Conceptual Form (RCF)
http://www.cs.hmc.edu/~dbethune/ghuru/RCF.html
To exhibit any level of comprehension at all, a computer must be able
to build grammar trees from the available text. This is a process
where the individual words are analyzed independently (to
determine their part of speech), and then strung together to create
grammatical units (phrases), and eventually sentences. This process
is analogous to the way you and I learn to understand written and
spoken English by gradually piecing small sections together to make
larger ones.
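
Here is a rough sketch of that bottom-up process in Python, using a
hypothetical toy lexicon and a single noun-phrase rule (det? adj* noun);
a real parser would need a far larger lexicon and grammar.

LEXICON = {
    "the": "det", "a": "det",
    "purple": "adj",
    "dog": "noun", "cat": "noun",
    "swallows": "verb",
}

def tag(sentence):
    """Analyze each word independently to find its part of speech."""
    return [(LEXICON[w], w) for w in sentence.lower().rstrip(".").split()]

def chunk(tagged):
    """String tagged words together into noun phrases (det? adj* noun)."""
    phrases, i = [], 0
    while i < len(tagged):
        if tagged[i][0] in ("det", "adj"):
            # Collect the run of modifiers plus the noun they attach to.
            j, np = i, []
            while j < len(tagged) and tagged[j][0] in ("det", "adj"):
                np.append(tagged[j])
                j += 1
            if j < len(tagged) and tagged[j][0] == "noun":
                np.append(tagged[j])
                j += 1
            phrases.append(("NP", np))
            i = j
        else:
            phrases.append((tagged[i][0], [tagged[i]]))
            i += 1
    return phrases

print(chunk(tag("The purple dog swallows a cat.")))
# [('NP', [('det', 'the'), ('adj', 'purple'), ('noun', 'dog')]),
#  ('verb', [('verb', 'swallows')]),
#  ('NP', [('det', 'a'), ('noun', 'cat')])]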
To then actually understand what is being said well enough to respond
to it logically is another step that requires some sort of vocabulary,
a set of words that are "known" by the computer. At the most basic
level, these words are known by their part of speech. At a slightly
higher level, you can try to figure out what is actually being
communicated by developing a list of relationships between the words.
Let me try an example.
Given the sentence ... The purple dog swallows a cat.
You would start to list the parts of speech.
- det(the)
- adj(purple)
- noun(dog)
- verb(swallows)
- det(a)
- noun(cat)
Then you would start developing dependencies.
- purple(dog)
- a(cat)
- the(purple(dog))
- swallows(the(purple(dog)), a(cat))
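
These dependency terms can be written down directly in code. A minimal
sketch (not GHuRU's actual representation) using nested Python tuples of
the form (functor, arg1, ...):

dog = ("dog",)
cat = ("cat",)
purple_dog = ("purple", dog)              # purple(dog)
a_cat = ("a", cat)                        # a(cat)
the_purple_dog = ("the", purple_dog)      # the(purple(dog))
sentence = ("swallows", the_purple_dog, a_cat)

def render(term):
    """Print a term back in the functor(arg, ...) notation used above."""
    head, *args = term
    if not args:
        return head
    return head + "(" + ", ".join(render(a) for a in args) + ")"

print(render(sentence))    # swallows(the(purple(dog)), a(cat))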
That last construction, swallows(the(purple(dog)), a(cat)), represents
the entire sentence, and indeed the thought being communicated. By
recursively going through this process, these thoughts, or concepts,
can be built up. Being able to understand what it means to define (or
equate) something would also allow GHuRU to build its vocabulary,
replacing cumbersome conceptual constructions with new words (for
instance, continually(thinks(Bob, Mary)) could be replaced with
loves(Bob, Mary)).
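
One hedged sketch of how that vocabulary-building step might work is
below: terms are rewritten whenever they match a learned definition.
Atoms are plain strings here, and the ?-variable pattern syntax for
definitions is invented purely for illustration.

DEFINITIONS = {
    # continually(thinks(?x, ?y))  =>  loves(?x, ?y)
    ("continually", ("thinks", "?x", "?y")): ("loves", "?x", "?y"),
}

def match(pattern, term, bindings):
    """Try to match a pattern against a term, collecting ?-variable bindings."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        bindings[pattern] = term
        return True
    if isinstance(pattern, str) or isinstance(term, str):
        return pattern == term
    if len(pattern) != len(term):
        return False
    return all(match(p, t, bindings) for p, t in zip(pattern, term))

def substitute(template, bindings):
    """Fill a definition's right-hand side with the collected bindings."""
    if isinstance(template, str):
        return bindings.get(template, template)
    return tuple(substitute(part, bindings) for part in template)

def simplify(term):
    """Rewrite a term with any known definition, recursing into arguments."""
    if isinstance(term, tuple):
        term = tuple(simplify(part) for part in term)
        for pattern, replacement in DEFINITIONS.items():
            bindings = {}
            if match(pattern, term, bindings):
                return substitute(replacement, bindings)
    return term

print(simplify(("continually", ("thinks", "Bob", "Mary"))))
# ('loves', 'Bob', 'Mary')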
Most natural language systems being worked on today are built on some
concept like this. The problem I see is that each individual project
seems to be developing its own version of computerese. Some of the work
on Multilingua, an intermediate language used for translation, did seem
to be headed in the direction of standardization, though, and I think
that is where the work should go. If a common language form could be
developed (and continually refined) that applications could then be
built to support, a lot of redundant effort could be avoided.
My idea of this standard language is called Reduced Conceptual Form.
This name reflects the fact that the language should incorporate some
compression by tossing out unneeded words, and also be able to make
the natural language more understandable by building word and phrase
dependencies in a logical manner (unlike most natural languages). A
definition of RCF would need to include a set of rules covering what
syntax should be used for all possible dependency types. This is a
major task, and the specification would surely have to be revised many
times.
A simple version could be built off of the small example I have shown above.
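
As a sketch of what that simple version might look like (assuming the
tagged-and-chunked form from the first sketch, and handling only the
NP-verb-NP shape of the example; a full RCF specification would need
rules for every dependency type):

chunked = [
    ("NP", [("det", "the"), ("adj", "purple"), ("noun", "dog")]),
    ("verb", [("verb", "swallows")]),
    ("NP", [("det", "a"), ("noun", "cat")]),
]

def np_to_term(tagged_words):
    """Fold modifiers around the head noun: the purple dog -> the(purple(dog))."""
    words = [w for _, w in tagged_words]
    term = words[-1]
    for modifier in reversed(words[:-1]):
        term = modifier + "(" + term + ")"
    return term

def to_rcf(phrases):
    """Reduce an NP-verb-NP chunking to a single RCF expression."""
    (_, subj), (_, [(_, verb)]), (_, obj) = phrases
    return verb + "(" + np_to_term(subj) + ", " + np_to_term(obj) + ")"

print(to_rcf(chunked))    # swallows(the(purple(dog)), a(cat))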
Questions or comments should be sent to dbethune@hmc.edu