Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!rutgers!princeton!mind!harnad
From: harnad@mind.UUCP
Newsgroups: comp.ai,comp.cog-eng
Subject: Re: The symbol grounding problem
Message-ID: <812@mind.UUCP>
Date: Sun, 7-Jun-87 14:25:00 EDT
Article-I.D.: mind.812
Posted: Sun Jun 7 14:25:00 1987
Date-Received: Sun, 7-Jun-87 23:39:31 EDT
References: <764@mind.UUCP> <768@mind.UUCP> <770@mind.UUCP> <6174@diamond.BBN.COM> <6358@diamond.BBN.COM>
Organization: Cognitive Science, Princeton University
Lines: 221
Keywords: icons, categories, symbols, grounding, modularity, cognition
Xref: utgpu comp.ai:457 comp.cog-eng:103
Summary: The A/D distinction vs. the Symbolic/Nonsymbolic Distinction

aweinste@Diamond.BBN.COM (Anders Weinstein) of BBN Laboratories, Inc., Cambridge, MA writes:

> [regarding invertibility, information preservation and the A/D
> distinction]: what I think is interesting is not preserving the
> signal itself but rather the *information* that the signal carries.
> In this sense, an analog signal conveys only a finite amount of
> information and it can in fact be converted to digital form and back
> to analog *without* any loss.

This is an important point and concerns a matter that is at the heart of the symbolic/nonsymbolic issue: What you're saying is appropriate for ordinary communication theory and communication-theoretic applications such as radio signals, telegraph, radar, CDs, etc. In all these cases the signal is simply a carrier that encodes information which is subsequently decoded at the receiving end. But in the case of human cognition this communication-theoretic model -- of signals carrying messages that are encoded/decoded on either end -- may not be appropriate. (Formal information theory has always had difficulties with "content" or "meaning."
This has often been pointed out, and I take this to be symptomatic of the fact that it's missing something as a candidate model for cognitive "information processing.")

Note that the communication-theoretic, signal-analytic view has a kind of built-in bias toward digital coding, since it's the "message" and not the "medium" that matters. But what if -- in cognition -- the medium *is* the message? This may well be the case in iconic processing (and the performances that it subserves, such as discrimination, similarity judgment, matching, short-term memory, mental rotation, etc.): It may be the structure or "shape" of the physical signal (the stimulus) itself that matters, not some secondary information or message it carries in coded form. Hence the processing may have to be structure- or shape-preserving in the physical analog sense I've tried to capture with the criterion of invertibility.

> a *digitized* image -- in your terms... is "analog" in the
> information it preserves and not in the information lost. This
> seems to me to be a very unhappy choice of terminology! Both analog
> and digitizing transformations must preserve *some* information.
> If all you're *really* interested in is the quality of being
> (naturally) information-preserving (i.e. physically invertible),
> then I'd strongly recommend you just use one of these terms and drop
> the misleading use of "analog", "iconic", and "digital".

I'm not at all convinced yet that the sense of iconic and analog that I am referring to is unrelated to the signal-analytic A/D distinction, although I've noted that it may turn out, on sufficient analysis, to be an independent distinction.
For the time being, I've acknowledged that my invertibility criterion is, if not necessarily unhappy, somewhat surprising in its implications, for it implies (1) that being analog may be a matter of degree (i.e., degree of invertibility) and (2) that even a classical digital system must be regarded as analog to a degree if one is considering a larger "dedicated" system of which it is a hard-wired (i.e., causally connected) component rather than an independent (human-interpretation-mediated) module.

Let me repeat, though, that it could turn out that, despite some suggestive similarities, these considerations are not pertinent to the A/D distinction but, say, to the symbolic/nonsymbolic distinction -- and even that only in the special context of cognitive modeling rather than signal analysis or artificial intelligence in general.

> With regard to [the] "symbol grounding problem": I think it's been
> well-understood for some time that causal interaction with the world
> is a necessary requirement for artificial intelligence...
> The philosophical rationale for this requirement is the fact that
> some causal "grounding" is needed in order to determine a semantic
> interpretation... But although everyone agrees that *some* kind of
> causal grounding is necessary for intentionality, it's notoriously
> difficult to explain exactly what sort it must be. And although the
> information-preserving transformations you discuss may play some role
> here, I really don't see how this challenges the premises of symbolic
> AI in the way you seem to think it does.

As far as I know, there have so far been only two candidate proposals to overcome the symbol grounding problem WITHOUT resorting to the kind of hybrid proposal I advocate (i.e., without giving up purely symbolic top-down modules): One proposal, as you note, is that a pure symbol-manipulating system can be "grounded" by merely hooking it up causally in the "right way" to the outside world with simple (modular) transducers and effectors. I have conjectured that this strategy will not work in cognitive modeling (and I have given my supporting arguments elsewhere: "Minds, Machines and Searle"). The strategy may work in AI and conventional robotics and vision, but that is because these fields *do not have a grounding problem*! They're only trying to generate intelligent *pieces* of performance, not to model the mind in *all* its performance capacity. Only cognitive modeling has a symbol grounding problem.

The second nonhybrid way to try to ground a purely symbolic system in real-world objects is by cryptology. Human beings, knowing already at least one grounded language and its relation to the world, can infer the meanings of a second one [e.g., ancient cuneiform] by using its internal formal structure plus what they already know: Since the symbol permutations and combinations of the unknown system (i.e., its syntactic rules) are constrained to yield a semantically interpretable system, sometimes the semantics can be reliably and uniquely decoded this way (despite Quine's claims about the indeterminacy of radical translation). It is obvious, however, that such a "grounding" would be derivative, and would depend entirely on the groundedness of the original grounded symbol system. (This is equivalent to Searle's "intrinsic" vs. "derived intentionality.") And *that* grounding problem remains to be solved in an autonomous cognitive model.

My own hybrid approach is simply to bite the bullet and give up on the hope of an autonomous symbolic level, the hope on which AI and symbolic functionalism had relied in their attempt to capture mental function. Although you can get a lot of clever performance by building in purely symbolic "knowledge," and although it had seemed so promising that symbol-strings could be interpreted as thoughts, beliefs, and mental propositions, I have argued that a mere extension of this modular "top-down" approach, hooking up eventually with peripheral modules, simply won't succeed in the long run (i.e., as we attempt to approach an asymptote of total human performance capacity, or what I've called the "Total Turing Test") because of the grounding problem and the nonviability of the two "solutions" sketched above (i.e., simple peripheral hook-ups and/or mediating human cryptology).

Instead, I have described a nonmodular hybrid representational system in which symbolic representations are grounded bottom-up in nonsymbolic ones (iconic and categorical). Although there is a symbolic level in such a system, it is not quite the autonomous all-purpose level of symbolic AI. It trades its autonomy for its groundedness.

> [W]hy must the arrangement you envision be "nonmodular"? A system
> may contain analog and digital subsystems and still be modular if
> the subsystems interact solely via well-defined inputs and outputs.

I'll try to explain why I believe that a successful mind-model (one able to pass the Total Turing Test) is unlikely to consist merely of a pure symbol-manipulative module connected to input/output modules. A pure top-down symbol system just consists of physically implemented symbol manipulations. You yourself describe a typical example of ungroundedness (from Georges Rey):

> it's possible that a program for playing chess could,
> when compiled, be *identical* to one used to plot
> strategy in the Six Day War.
> If you look only at the
> formal symbol manipulations, you can't distinguish between
> the two interpretations; it's only by virtue of the causal
> relations between the symbols and the world that the symbols
> could have one meaning rather than another.

Now consider two cases of "fixing" the symbol interpretations by grounding the causal relations between the symbols and the world. In (1) a "toy" case -- a circumscribed little chunk of performance such as chess-playing or war-games -- the right causal connections could be wired according to the human encryption/decryption scheme: Inputs and outputs could be wired into their appropriate symbolic descriptions. There is no problem here, because the toy problems are themselves modular, and we know all the ins and outs. But none but the most diehard symbolic functionalist would want to argue that such a simple toy model was "thinking," or even doing anything remotely like what we do when we accomplish the same performance. The reason is that we are capable of doing *so much more* -- and not by an assemblage of endless independent modules of essentially the same sort as these toy models, but by some sort of (2) integrated internal system.

Could that "total" system be just an oversized toy model -- a symbol system with its interpretations "fixed" by a means analogous to these toy cases? I am conjecturing that it is not. Toy models don't think. Their internal symbols really *are* meaningless, and hence setting them in the service of generating a toy performance just involves hard-wiring our intended interpretations of its symbols into a suitable dedicated system. Total (human-capacity-sized) models, on the other hand, will, one hopes, think, and hence the intended interpretations of their symbols will have to be intrinsic in some deeper way than the analogy with the toy model would suggest, at least so I think.

This is my proposed "nonmodular" candidate: Every formal symbol system has both primitive atomic symbols and composite symbol-strings consisting of ruleful combinations of the atoms. Both the atoms and the combinations are semantically interpretable, but from the standpoint of the formal syntactic rules governing the symbol manipulations, the atoms could just as well have been undefined or meaningless.

I hypothesize that the primitive symbols of a nonmodular cognitive symbol system are actually the (arbitrary) labels of object categories, and that these labels are reliably assigned to their referents by a nonsymbolic representational system consisting of (i) iconic (invertible, one-to-one) transformations of the sensory surface and (ii) categorical (many-to-few) representations that preserve only the features that suffice to reliably categorize and label sensory projections of the objects in question. Hence, rather than being primitive and undefined, and hence independent of interpretation, I suggest that the atoms of cognitive symbol systems are grounded, bottom-up, in such a categorization mechanism. The higher-order symbol combinations inherit the bottom-up constraints, including the nonsymbolic representations to which they are attached, rather than being an independent top-down symbol-manipulative module with its connections to an input/output module open to being fixed in various extrinsically determined ways.

> it isn't clear why *any* (intuitively) analog processing need
> take place at all. I presume the stance of symbolic AI is that
> sensory input affects the system via an isolable module which converts
> incoming stimuli into symbolic representations. Imagine a vision
> sub-system that converts incoming light into digital form at the
> first stage, as it strikes a grid of photo-receptor surfaces, and is
> entirely digital from there on in. Such a system is still "grounded"
> in information-preserving representations in the sense you require.
> In short, I don't see any *philosophical* reason why symbol-grounding
> requires analog processing or a non-modular structure.

It is exactly this modular scenario that I am calling into question. It is not clear at all that a cognitive system must conform to it. To get a device to be able to do what we can do we may have to stop thinking in terms of "isolable" input modules that go straight into symbolic representations. That may be enough to "ground" a conventional toy system, but, as I've said, such toy systems don't have a grounding problem in the first place, because nobody really believes they're thinking. To get closer to life-size devices -- devices that can generate *all* of our performance capacity, and hence may indeed be thinking -- we may have to turn to hybrid systems in which the symbolic functions are nonmodularly grounded, bottom-up, in the nonsymbolic ones. The problem is not a philosophical one, it's an empirical one: What looks as if it's likely to work, on the evidence and reasoning available?

--
Stevan Harnad (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad
harnad%mind@princeton.csnet
harnad@mind.Princeton.EDU