Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!caip!topaz!ll-xn!nike!ucbcad!ucbvax!decvax!tektronix!uw-beaver!ssc-vax!bcsaic!michaelm
From: michaelm@bcsaic.UUCP (michael maxwell)
Newsgroups: net.ai
Subject: CYC Project at MCC
Message-ID: <614@bcsaic.UUCP>
Date: Wed, 16-Jul-86 20:27:56 EDT
Article-I.D.: bcsaic.614
Posted: Wed Jul 16 20:27:56 1986
Date-Received: Sat, 19-Jul-86 03:51:16 EDT
Organization: Boeing Computer Services AI Center, Seattle
Lines: 88

(long!)

I just finished reading the following article:

%A Doug Lenat
%A Mayank Prakash
%A Mary Shepherd
%T CYC: Using Common Sense Knowledge to Overcome Brittleness and Knowledge \
Acquisition Bottlenecks
%J The AI Magazine
%V 6
%N 4
%P 65-85
%D 1985
%X MCC's CYC project is the building, over the coming decade, of a large
knowledge base (or KB) of real world facts and heuristics and--as a part of
the KB itself--methods for efficiently reasoning over the KB.

I haven't seen much discussion of this article (there are two letters to the
editor in the next issue, but neither goes into much depth).  It seems to me
that this project has at least the potential of changing the field of AI more
than any other project now in progress (hold the flames, please!).  On the
other hand, it could be a real fiasco; I suspect that we won't know until it's
tried.  But the kind of discussion I'd like to see is not whether CYC will
succeed (for as I say, I don't think anyone can possibly *know* now), but
rather about the methodology--how it could be improved, where the weaknesses
are, i.e. substantive issues, rather than "Gee, this is great!" or flames.
So if you think there are weaknesses, please be explicit, and preferably give
a better way to do it.

By way of starting some discussion (and probably violating the standards I just
gave :-), let me suggest some questions.

1. In their typology of the analogy-space, they list (pg. 69) several ways
in which two frames might be seen to be identical.  "Two frames have
several slots with identical names and values...  Two frames can have
identically named slots whose values are not quite identical...  Both the
names of the slots and the values they contain may match but be
nonidentical [e.g. they might both be specializations of the same slot]."
A lot seems to ride here on identicalness of slot names.  Given that they
expect most entries to be done by copy-and-edit, this is at least
plausible in the case of frames that are specializations of the same system (e.g.
"irrigation" and "subway systems" might both be specializations of
"transportation systems").  But does this leave out some of the more
interesting types of analogy?  E.g. the analogy between cable cars (or
whatever it was--some sort of mass transit vehicle) and computer
architecture that is drawn in the movie "TRON"?  I don't think one is a
priori likely to copy the frame for computer hardware from that for
transportation (or am I wrong?)--and if you do, won't you miss other
interesting analogies for computer hardware?  See also their comment (pg. 70)
"that the precise way two concepts are represented can radically affect how
easy it is to find the analogy between them."
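
To make the worry concrete, here is a rough sketch in Python of matching
frames by slot name.  The frames, slot names, and values are all made up by
me for illustration, and this is not a claim about how CYC actually stores or
compares frames; it only shows what "riding on identical slot names" could
mean in practice.

```python
# Toy frames: plain dicts from slot name to value. All slot names and
# values are invented for illustration; none come from the CYC article.
irrigation = {"carries": "water", "conduit": "canal", "purpose": "distribution"}
subway = {"carries": "people", "conduit": "tunnel", "purpose": "distribution"}
computer = {"moves": "data", "pathway": "bus", "function": "distribution"}


def compare_frames(a, b):
    """Classify the slots two frames share by name.

    Returns (same_value, different_value): the slot names present in both
    frames, split by whether the two frames hold equal values there.
    """
    shared = sorted(set(a) & set(b))
    same_value = [s for s in shared if a[s] == b[s]]
    different_value = [s for s in shared if a[s] != b[s]]
    return same_value, different_value


# Copy-and-edit siblings share slot names, so a match is found.
print(compare_frames(irrigation, subway))
# Same underlying idea, different slot vocabulary: nothing matches.
print(compare_frames(subway, computer))
```

The second comparison comes back empty even though "function: distribution"
and "purpose: distribution" are saying much the same thing, which is the kind
of analogy I'm afraid a name-based match would miss.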

2. Re the discussion on endowing an expert system with common sense (pg. 71),
are they doing anything more than assigning a data type to the arguments of a
function?  How does this relate to the idea that typeless programming 
languages have certain advantages?
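
Here is the sort of thing I mean by "assigning a data type", sketched in
Python.  The category names, the "isa" table, and the slot constraints are
hypothetical ones of my own, not taken from the article; the point is only
that checking a slot filler against a category hierarchy looks a lot like
checking an argument against a declared type.

```python
# Hypothetical "isa" hierarchy: each category maps to its parent.
# The category names are invented for this sketch.
ISA = {"Patient": "Person", "Person": "Animal", "Car": "Vehicle"}


def is_a(category, ancestor):
    """True if category equals ancestor or reaches it by following ISA links."""
    while category is not None:
        if category == ancestor:
            return True
        category = ISA.get(category)  # None once we run off the top
    return False


def check_slot(slot_constraints, slot, filler_category):
    """Check a proposed slot filler against the slot's declared category."""
    return is_a(filler_category, slot_constraints[slot])


# Hypothetical constraint: whatever fills "treatedBy" must be a Person.
constraints = {"treatedBy": "Person"}

print(check_slot(constraints, "treatedBy", "Patient"))  # a Patient is a Person
print(check_slot(constraints, "treatedBy", "Car"))      # a Car is not
```

If that is all there is to it, then the "common sense" is a subtype check, and
the usual arguments for and against typed languages would seem to apply.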

3. Also on pg. 71, last paragraph, discussing how new frames get added: "the
expert would discover this by arriving at *the* place where Patients should
be...and not finding it there."  [-emphasis mine, MM]  As the database gets
more complex, will it become increasingly difficult to find where a concept
should be?  Probably this violates my standards above--I should wait and see!
See also their comments on pg. 76.

4. It seems to me that the trickiest parts are their steps 3 (pg. 80, "extract
and encode the implied (common sense) knowledge") and 4 (pg. 81, "extract and
encode the intersentential knowledge").  Will the encyclopedia articles
really presuppose all the knowledge we would want?  E.g. consider the
interpretation of indefinite NPs in opaque contexts:
	(1) Everyone is looking for a lost boy.
	(2) Everyone is trying to catch a fish.
The most likely interpretation of (1) is that there is a particular boy,
whereas the most likely interpretation of (2) (in the absence of information
to the contrary) is that no one has any particular fish in mind.  What kind of
information might you run across in an encyclopedia that would presuppose this
sort of distinction, and therefore enter it in the presupposed knowledge base?  
Remember that you need to make the distinction not only for boys and fish, but
also for answers to arithmetic problems, solutions to Fermat's Last Theorem (of which
there may be several), etc.  If this particular example is problematical for
CYC, is it an isolated case, or are there lots more?
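
For what it's worth, here is a crude Python sketch of the distinction the KB
would have to be able to represent.  The `Search` record and the names in it
are my own invention, and this is a toy encoding rather than a serious
treatment of opaque contexts: each search either has a particular individual
as its target, or only a description that any individual of the right kind
would satisfy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Search:
    seeker: str
    kind: str                  # description the target must satisfy
    individual: Optional[str]  # a particular entity, or None if any will do


def reading(searches):
    """Label a set of searches by how their targets are fixed."""
    targets = {s.individual for s in searches}
    if targets == {None}:
        return "nonspecific"       # nobody has a particular one in mind
    if None not in targets and len(targets) == 1:
        return "specific, shared"  # everyone is after the same individual
    return "mixed"


# (1) Everyone is looking for a lost boy: one particular boy (name invented).
lost_boy = [Search("ann", "boy", "timmy"), Search("bob", "boy", "timmy")]
# (2) Everyone is trying to catch a fish: any fish will do.
fishing = [Search("ann", "fish", None), Search("bob", "fish", None)]

print(reading(lost_boy))
print(reading(fishing))
```

Representing the two readings is the easy part; my question is where the
knowledge that picks the likely reading for a given noun would come from.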

Well, I have a feeling that I may regret posting this.  Let the flames
come...
-- 
Mike Maxwell
Boeing Artificial Intelligence Center
	...uw-beaver!uw-june!bcsaic!michaelm