GHuRU - Generic Higher Reasoning Utility
http://www.cs.hmc.edu/~dbethune/ghuru/
The World Wide Web is a huge, grossly disorganized information and
entertainment resource. Early efforts to manage the whole mess, such
as Yahoo, a searchable, hierarchical directory of existing web sites,
and AltaVista, a huge, searchable database of pages, have proved
themselves invaluable.
Without some sort of organization, the usefulness of the World Wide
Web would have been quickly overwhelmed by the sheer bulk of
information available. The weakness of existing organizational
techniques (as well as much of their power), however, is their
simplicity. Queries must generally be entered in a logical form that
many people find confusing, and the results returned are often not
quite what you are looking for, especially for more complex searches.
This is where GHuRU comes in. Wouldn't it be neat if you could
go to some web site, type in a question (as broad or as specific as
you choose), and get an answer back, with both the question and the
answer in plain English? That's what GHuRU will attempt to do. I will
outline the basic design for GHuRU, including a couple of alternative
methodologies. I'll also go into some detail about the components
involved in the system and outline some problems that would come up
during implementation.
|| General Overview || Implementation Concerns || Possible Offshoots ||
General Overview
The basic system is pretty simple. It consists of three parts: a
search engine (multiple search engines can also be supported); a
natural language processing (NLP) unit that converts between natural
language (English in this case) and a logical language form that I am
calling Reduced Conceptual Form (RCF); and a Higher Reasoning Unit
(HRU) that accepts RCF as input and creates Highly Reduced Conceptual
Form (HRCF) as output, which is then converted back into our language
of choice. There is also an intermediate step bridging the search
engine's output to the rest of the system. In this step, information
is actually retrieved from the World Wide Web and weighted based on
its reliability.
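The data flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a design that exists anywhere: every name here (answer_question, WeightedDocument, and the six stage functions passed in) is hypothetical, RCF and HRCF are represented as plain strings because neither form is defined yet, and whether the search engine is queried with RCF or with something simpler is left open by this overview.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class WeightedDocument:
    """A retrieved page, already converted to RCF, with its reliability weight."""
    url: str
    rcf_text: str
    reliability: float  # assumed scale: 0.0 (untrusted) to 1.0 (fully trusted)


def answer_question(
    question: str,
    to_rcf: Callable[[str], str],            # NLP: natural language -> RCF
    search: Callable[[str], List[str]],      # search engine: query -> URLs
    fetch: Callable[[str], str],             # bridge: retrieve a page from the web
    weigh: Callable[[str, str], float],      # bridge: (url, text) -> reliability
    reason: Callable[[str, List[WeightedDocument]], str],  # HRU: RCF -> HRCF
    from_hrcf: Callable[[str], str],         # NLP: HRCF -> natural language
) -> str:
    """Run one question through the stages in the order the overview lists them."""
    rcf_query = to_rcf(question)

    # Assumption: the search engine is handed the RCF form of the question.
    urls = search(rcf_query)

    # Intermediate step: retrieve each hit and attach a reliability weight.
    documents = []
    for url in urls:
        text = fetch(url)
        documents.append(WeightedDocument(url, to_rcf(text), weigh(url, text)))

    # The HRU reduces the weighted RCF documents to an HRCF answer.
    hrcf_answer = reason(rcf_query, documents)
    return from_hrcf(hrcf_answer)
```

Passing the stages in as functions keeps the sketch honest about what is unknown: a real NLP unit, search engine, or HRU could be substituted for any of them without changing the overall flow.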
|| RCF || HRU || NLP || Search Engine ||
In this design, I will assume the existence of a perfect Natural
Language Processor compatible with RCF, but I will make some attempt
to outline the requirements for a strict definition of RCF. I will
use AltaVista as an example search engine while describing the issues
involved with the generation of reliability weights. Finally, I will
discuss the HRU in some detail and try to describe how it might
function.
I can currently envision two different forms that GHuRU could take.
The first would have an integrated search engine that constantly
looks for new information and higher concepts (by resolving
independent lower concepts with each other), building its knowledge
base as it goes. The second would be more of a research tool that
tries to answer the questions you pose to it by searching available
engines, but never tries to go beyond that. With the first model,
GHuRU would try to answer new questions. Following the old idea that
every answer opens five new questions, GHuRU would, in time, learn
everything that was available to be learned. With the web as the
source, this learning process could go on for quite some time.
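The self-directed learning of the first model amounts to a worklist loop: answer a question, store the answer, and queue whatever new questions it raises. The sketch below is a guess at that loop and nothing more. The function learn, its answer callback (assumed to return an answer together with its follow-up questions), and the max_questions cap are all hypothetical, and the knowledge base is reduced to a plain dictionary.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Tuple


def learn(
    seed_questions: Iterable[str],
    answer: Callable[[str], Tuple[str, List[str]]],
    max_questions: int = 1000,
) -> Dict[str, str]:
    """Answer questions breadth-first, queueing the new questions each answer opens."""
    knowledge: Dict[str, str] = {}
    pending = deque(seed_questions)

    # The cap is an assumption: with the web as the source, the queue may never empty.
    while pending and len(knowledge) < max_questions:
        question = pending.popleft()
        if question in knowledge:
            continue  # already answered through another path

        result, follow_ups = answer(question)
        knowledge[question] = result

        # Only queue follow-up questions that have not been answered yet.
        pending.extend(q for q in follow_ups if q not in knowledge)

    return knowledge
```

The research-tool model corresponds to calling the answer step once for the user's question and discarding the follow-ups.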
Implementation Concerns
The biggest concern with any implementation is feasibility. Given the
current state of computer technology, is this a worthwhile project? I
think that the potential for a system like this makes its pursuit worthwhile
even if a full large-scale implementation might not be feasible at this time.
I think that a largish system (something akin to what AltaVista uses)
could handle queries at an acceptable pace. If the search engine
component were not integrated into the system, as would almost surely
be the case in an early prototype, the network lag would necessitate a
relatively large amount of time to answer a query. I would envision a
scenario where you would ask a question and then have the answer
emailed to you in an hour or so. Of course, this wouldn't suit the
needs of the average web surfer, but once a prototype is available,
all the pieces would be in place for further development.
Possible Offshoots
Any good idea makes you think of several more good ideas that are
slight variations. An intranet version of this system could certainly
be used to great effect, operating on a finite (is the web
finite?) supply of organized (hopefully) information. In a case
such as that, the real win over traditional database systems would be
the easy-to-use interface, as well as the concept search idea as
opposed to a simple keyword search.
If coupled with a translation system, GHuRU could also accept and
answer questions in any language, drawing its information from data
sources all over the world in multiple languages.
Questions or comments should be sent to dbethune@hmc.edu