Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!caip!topaz!ll-xn!nike!ucbcad!ucbvax!decvax!tektronix!uw-beaver!ssc-vax!bcsaic!michaelm
From: michaelm@bcsaic.UUCP (michael maxwell)
Newsgroups: net.ai
Subject: CYC Project at MCC
Message-ID: <614@bcsaic.UUCP>
Date: Wed, 16-Jul-86 20:27:56 EDT
Article-I.D.: bcsaic.614
Posted: Wed Jul 16 20:27:56 1986
Date-Received: Sat, 19-Jul-86 03:51:16 EDT
Organization: Boeing Computer Services AI Center, Seattle
Lines: 88

(long!)

I just finished reading the following article:

%A Doug Lenat
%A Mayank Prakash
%A Mary Shepherd
%T CYC: Using Common Sense Knowledge to Overcome Brittleness and Knowledge \
Acquisition Bottlenecks
%J The AI Magazine
%V 6
%N 4
%P 65-85
%D 1985
%X MCC's CYC project is the building, over the coming decade, of a large
knowledge base (or KB) of real world facts and heuristics and--as a part of
the KB itself--methods for efficiently reasoning over the KB.

I haven't seen much discussion of this article (there are two letters to the editor in the next issue, but neither goes into much depth). It seems to me that this project has at least the potential of changing the field of AI more than any other project now in progress (hold the flames, please!). On the other hand, it could be a real fiasco; I suspect that we won't know until it's tried.

But the kind of discussion I'd like to see is not whether CYC will succeed (for as I say, I don't think anyone can possibly *know* now), but rather about the methodology--how it could be improved, where the weaknesses are, i.e. substantive issues, rather than "Gee, this is great!" or flames. So if you think there are weaknesses, please be explicit, and preferably give a better way to do it.

By way of starting some discussion (and probably violating the standards I just gave :-), let me suggest some questions.

1. In their typology of the analogy-space, they list (pg. 69) several ways in which two frames might be seen to be identical.
"Two frames have several slots with identical names and values... Two frames can have identically named slots whose values are not quite identical... Both the names of the slots and the values they contain may match but be nonidentical [e.g. they might both be specializations of the same slot]." A lot seems to ride here on identicalness of slot names. Given that they expect most entries to be done by copy-and-edit, this is at least plausible ICO frames that are specializations of the same system (e.g. "irrigation" and "subway systems" might both be specializations of "transportation systems"). But does this leave out some of the more interesting types of analogy? E.g. the analogy between cable cars (or whatever it was--some sort of mass transit vehicle) and computer architecture that is drawn in the movie "TRON"? I don't think one is a priori likely to copy the frame for computer hardware from that for transportation (or am I wrong?)--and if you do, won't you miss other interesting analogies for computer hardware? See also their comment (pg. 70) "that the precise way two concepts are represented can radically effect how easy it is to find the analogy between them." 2. Re the discussion on endowing an expert system with common sense (pg. 71), are they doing anything more than assigning a data type to the arguments of a function? How does this relate to the idea that typeless programming languages have certain advantages? 3. Also on pg. 71, last paragraph, discussing how new frames get added: "the expert would discover this by arriving at *the* place where Patients should be...and not finding it there." [-emphasis mine, MM] As the database gets more complex, will it become increasingly difficult to find where a concept should be? Probably this violates my standards above--I should wait and see! See also their comments on pg. 76. 4. It seems to me that the trickiest part is their steps 3 (pg. 80, "extract and encode the implied (common sense) knowledge" and 4 (pg. 
81, "extract and encode the intersentential knowledge"). Will the encyclopedia articles really presuppose all the knowledge we would want? E.g. consider the interpretation of indefinite NPs in opaque contexts:

    (1) Everyone is looking for a lost boy.
    (2) Everyone is trying to catch a fish.

The most likely interpretation of (1) is that there is a particular boy, whereas the most likely interpretation of (2) (in the absence of information to the contrary) is that no one has any particular fish in mind. What kind of information might you run across in an encyclopedia that would presuppose this sort of distinction, and therefore enter it in the presupposed knowledge base? Remember that you need to make the distinction not only for boys and fish, but answers to arithmetic problems, solutions to Fermat's Last Theorem (of which there may be several), etc. If this particular example is problematical for CYC, is it an isolated case, or are there lots more?

Well, I have a feeling that I may regret posting this. Let the flames come...
-- 
Mike Maxwell
Boeing Artificial Intelligence Center
	...uw-beaver!uw-june!bcsaic!michaelm
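
To make question 1 concrete, here is a minimal sketch of slot-name-based matching along the lines of the three cases quoted from pg. 69. It is only an illustration under stated assumptions: the frames, the slot names, the toy slot hierarchy (SLOT_PARENT), and the function match_frames are all hypothetical and are not taken from the CYC article or from any CYC code.

```python
# Hypothetical toy slot hierarchy: each slot name maps to a more general slot.
SLOT_PARENT = {"carriesWater": "carries", "carriesPassengers": "carries"}

def generalizations(slot):
    """Return the slot followed by its ancestors in the toy hierarchy."""
    chain = [slot]
    while chain[-1] in SLOT_PARENT:
        chain.append(SLOT_PARENT[chain[-1]])
    return chain

def match_frames(a, b):
    """Classify the slots of two frames (dicts of slot -> value) into the
    three kinds of match paraphrased from the quoted typology."""
    result = {"identical": [], "same_name": [], "sibling_slots": []}
    # Cases 1 and 2: same slot name, with identical or differing values.
    for slot, value in a.items():
        if slot in b:
            kind = "identical" if b[slot] == value else "same_name"
            result[kind].append(slot)
    # Case 3: different slot names that specialize a common, more general slot.
    for sa in a:
        for sb in b:
            if sa not in b and sb not in a:
                shared = set(generalizations(sa)[1:]) & set(generalizations(sb)[1:])
                if shared:
                    result["sibling_slots"].append((sa, sb))
    return result

# Invented example frames (assumptions, for illustration only).
irrigation = {"isA": "TransportationSystem", "medium": "canal", "carriesWater": True}
subway = {"isA": "TransportationSystem", "medium": "rail", "carriesPassengers": True}
computer = {"movesData": True, "pathway": "bus"}

print(match_frames(irrigation, subway))
print(match_frames(subway, computer))
```

In this toy version the irrigation and subway frames match on all three counts, because they were (by assumption) built from the same parent and share slot names. The subway and computer frames match on nothing, since no slot name or slot ancestor is shared, even though a person would see the analogy at once. That is the worry raised in question 1, in miniature.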