Xref: utzoo comp.ai:3067 talk.philosophy.misc:1819 sci.lang:3912
Path: utzoo!attcan!uunet!lll-winken!lll-ncis!helios.ee.lbl.gov!nosc!ucsd!rutgers!elbereth.rutgers.edu!harnad
From: harnad@elbereth.rutgers.edu (Stevan Harnad)
Newsgroups: comp.ai,talk.philosophy.misc,sci.lang
Subject: Re: Categorization
Summary: MIScategorization and its consequences
Message-ID:
Date: 12 Jan 89 16:00:33 GMT
References: <681@cogsci.ucsd.EDU> <2959@uhccux.uhcc.hawaii.edu> <179@calmasd.GE.COM>
Organization: Rutgers Univ., New Brunswick, N.J.
Lines: 109

reiter@babbage.harvard.edu (Ehud Reiter) of Aiken Computation Lab
Harvard, Cambridge, MA wrote:

" the only reason *I* categorize a penguin as a bird is that I was taught
" this in school. I doubt I would have put penguins in the same category
" as robins if I had made up my own categories... biologists themselves
" are debating what the "correct" taxonomic categories should be...
" "bird" is a culturally defined and perhaps somewhat artificial
" category, and may not have a simple definition as a set of features.

Some categories are indeed arbitrary (such as "things bigger than a
breadbox," "things I like to wear on Tuesdays," "the three most similar
grapheme strings on the prior line," or "things I call 'throg' because
the spirit moves me") but not the interesting and important ones, such
as penguin or bird. Our nonarbitrary categorizations are constrained by
their consequences -- the consequences of MIScategorizing. In practical
cases like (edible) mushroom vs. (poisonous) toadstool, the consequences
are obvious. In empirical science the consequences are subtler, but they
are there (e.g., electron vs. positron). In taxonomy, too, there are
consequences of miscategorizing (failed predictions), and to the extent
that there are no empirical consequences, there might indeed be an
element of arbitrariness in taxonomy. You were taught that a penguin is
a bird for reasons that you must study zoology, morphology and evolution
to understand fully.
Perhaps at the fuzzy frontiers of taxonomy, as elsewhere in science,
there is still uncertainty. It still makes no difference what your
layman's inclinations are when it comes to sorting birds. In a
prescientific culture, this, like sorting the "elements," may have been
an arbitrary cultural matter, but not once one is better informed. Nor
is a bird an artificial category (like walking-stick): It is a natural
kind, whose nature is to be discovered, not stipulated. And that nature
(insofar as it has palpable or measurable consequences) will always
constrain our categorizations.

" there is a lot more to how categories are defined and used than
" perceptual features... I also doubt *I* have a definition of "penguin"
" as a set of perceptual features. All the penguins I have ever seen have
" been in zoos, with signs telling me that they were penguins. I doubt I
" could reliably identify an animal as a penguin without the presence of
" those handy signs... I can still use categorization information, even
" if I cannot define that category in terms of perceptual features. This
" is even more true for abstract categories - what set of perceptual
" features identify Republicans? Lawyers? Widows?

I think you (and most of us) could reliably identify a penguin without
the help of a sign (though the sign might have been helpful feedback
when we first learned to identify them). You may not have a DEFINITION,
but you can certainly pick 'em out; it's hence reasonable to conclude
that something in your head is successfully detecting them on the basis
of reliable features in the sensory representation that you cannot
verbalize (or have not yet learned to).

Regarding abstract categories: As I suggested in my original posting, I
think they are GROUNDED in concrete categories (and in the book I sketch
a symbol grounding model that accomplishes this). The perceptual
representations in which they are grounded, like their sensory features,
may not be available to introspection.
Moreover, there may be abstract, symbolic rules guiding our abstract
categorizations that are likewise not introspectively obvious or
accessible, yet reliably guiding our categorizations. That's just an
inference, of course, but a pretty reasonable one; the alternative is
either the incoherent Roschian view (which purports to get all-or-none
categorization performance from models for graded typicality judgments)
or else just plain magic.

" should we put red pandas and giant pandas into their own
" class, "panda"? As language users, the choice is ours - but whatever
" choice we make, it will be something that has to be taught, not
" something that is intuitively obvious. And if another culture makes a
" different choice than we do, we would not be justified in saying they
" were "wrong" and we were "right".

As I suggested, the taxonomy of natural kinds -- to the extent that it
is empirical (with testable consequences arising from MIScategorization)
rather than hermeneutic (i.e., just a matter of interpretation,
subjective similarity, or arbitrary convention) -- IS a matter of
"right" vs. "wrong." I'd like to account for our ability to categorize
birds, penguins and electrons reliably and correctly. I'll leave the
hermeneutics to the Roschians.

[I have made a distinction between two kinds of categorization and
categorization task that might be instructive here: "ad lib" vs.
"imposed" categorization. In an ad lib categorization task (as used by
Tversky and others who investigate similarity), instances are presented
to the subject, who is then to sort them as he sees fit. In "imposed"
categorization, there is feedback as to whether the categorization is
correct or incorrect (this is also called "supervised learning"):
Miscategorization has consequences in imposed categorization, but not in
ad lib categorization (which I just consider to be a form of
impressionistic similarity judgment).
Typicality judgments are really just similarity judgments too;
similarity is, by nature, graded, continuous, and a matter of degree.
Categorization, on the other hand, is discrete, categorical and
all-or-none. I think that imposed categorization tasks are the right
ones to look at in modeling categorization, along with their feedback
from miscategorization. They are representative of reality and the
constraints it imposes on how we sort things. Ad lib categorization is
not really categorization AT ALL, but just subjective similarity
judgments based on the default similarity structure of the set of
inputs, as dictated either by our sensory systems, our prior (imposed)
categories, or both.]
--
Stevan Harnad INTERNET: harnad@confidence.princeton.edu harnad@princeton.edu
srh@flash.bellcore.com harnad@elbereth.rutgers.edu harnad@princeton.uucp
BITNET: harnad@pucc.bitnet CSNET: harnad%princeton.edu@relay.cs.net
(609)-921-7771