Xref: utzoo comp.ai:3067 talk.philosophy.misc:1819 sci.lang:3912
Path: utzoo!attcan!uunet!lll-winken!lll-ncis!helios.ee.lbl.gov!nosc!ucsd!rutgers!elbereth.rutgers.edu!harnad
From: harnad@elbereth.rutgers.edu (Stevan Harnad)
Newsgroups: comp.ai,talk.philosophy.misc,sci.lang
Subject: Re: Categorization
Summary: MIScategorization and its consequences
Message-ID: <Jan.12.11.00.32.1989.25536@elbereth.rutgers.edu>
Date: 12 Jan 89 16:00:33 GMT
References: <681@cogsci.ucsd.EDU> <2959@uhccux.uhcc.hawaii.edu> <179@calmasd.GE.COM>
Organization: Rutgers Univ., New Brunswick, N.J.
Lines: 109


reiter@babbage.harvard.edu (Ehud Reiter) of Aiken Computation Lab
Harvard, Cambridge, MA wrote:

" the only reason *I* categorize a penguin as a bird is that I was taught
" this in school. I doubt I would have put penguins in the same category
" as robins if I had made up my own categories... biologists themselves
" are debating what the "correct" taxonomic categories should be...
" "bird" is a culturally defined and perhaps somewhat artificial
" category, and may not have a simple definition as a set of features.

Some categories are indeed arbitrary (such as "things bigger than a
breadbox," "things I like to wear on Tuesdays," "the three most
similar grapheme strings on the prior line," or "things I call 'throg'
because the spirit moves me") but not the interesting and important
ones, such as penguin or bird. Our nonarbitrary categorizations are
constrained by their consequences -- the consequences of
MIScategorizing. In practical cases like (edible) mushroom vs. (poisonous)
toadstool, the consequences are obvious. In empirical science the
consequences are subtler, but they are there (e.g., electron vs. positron). In
taxonomy, too, there are consequences of miscategorizing (failed
predictions), and to the extent that there are no empirical
consequences, there might indeed be an element of arbitrariness in
taxonomy.

You were taught that a penguin is a bird for reasons that you must
study zoology, morphology and evolution to understand fully. Perhaps
at the fuzzy frontiers of taxonomy, as elsewhere in science, there is
still uncertainty. It still makes no difference what your layman's
inclinations are when it comes to sorting birds. In a prescientific
culture, this, like sorting the "elements," may have been an arbitrary
cultural matter, but not once one is better informed. Nor is a bird
an artificial category (like walking-stick): It is a natural kind,
whose nature is to be discovered, not stipulated. And that nature
(insofar as it has palpable or measurable consequences) will always
constrain our categorizations.

" there is a lot more to how categories are defined and used than
" perceptual features... I also doubt *I* have a definition of "penguin"
" as a set of perceptual features. All the penguins I have ever seen have
" been in zoos, with signs telling me that they were penguins. I doubt I
" could reliably identify an animal as a penguin without the presence of
" those handy signs... I can still use categorization information, even
" if I cannot define that category in terms of perceptual features. This
" is even more true for abstract categories - what set of perceptual
" features identify Republicans? Lawyers? Widows?

I think you (and most of us) could reliably identify a penguin without
the help of a sign (though the sign might have been helpful feedback
when we first learned to identify them). You may not have a
DEFINITION, but you can certainly pick 'em out; hence it's reasonable to
conclude that something in your head is successfully detecting them
on the basis of reliable features in the sensory representation that
you cannot verbalize (or have not yet learned to).

Regarding abstract categories: As I suggested in my original posting, I
think they are GROUNDED in concrete categories (and in the book I sketch
a symbol grounding model that accomplishes this). The perceptual
representations in which they are grounded, like their sensory
features, may not be available to introspection. Moreover, there may be
abstract, symbolic rules that reliably guide our abstract categorizations
yet are likewise not introspectively obvious or accessible. That's just
an inference, of
course, but a pretty reasonable one; the alternative is either the
incoherent Roschian view (which purports to get all-or-none
categorization performance from models for graded typicality judgments)
or else just plain magic.

" should we put red pandas and giant pandas into their own
" class, "panda"? As language users, the choice is ours - but whatever
" choice we make, it will be something that has to be taught, not
" something that is intuitively obvious. And if another culture makes a
" different choice than we do, we would not be justified in saying they
" were "wrong" and we were "right".

As I suggested, the taxonomy of natural kinds -- to the extent that it
is empirical (with testable consequences arising from
MIScategorization) rather than hermeneutic (i.e., just a matter of
interpretation, subjective similarity, or arbitrary convention) -- IS a
matter of "right" vs. "wrong." I'd like to account for our ability to
categorize birds, penguins and electrons reliably and correctly. I'll
leave the hermeneutics to the Roschians.

[I have made a distinction between two kinds of categorization and
categorization task that might be instructive here: "ad lib" vs. "imposed"
categorization. In an ad lib categorization task (as used by Tversky
and others who investigate similarity), instances are presented to the
subject, who is then to sort them as he sees fit. In "imposed"
categorization, there is feedback as to whether the categorization is
correct or incorrect (this is also called "supervised learning"):
Miscategorization has consequences in imposed categorization, but not
in ad lib categorization (which I just consider to be a form of 
impressionistic similarity judgment). Typicality judgments are really
just similarity judgments too; similarity is, by nature, graded,
continuous, and a matter of degree. Categorization, on the other hand,
is discrete, categorical and all-or-none. I think that imposed
categorization tasks are the right ones to look at in modeling
categorization, along with their feedback from miscategorization. They
are representative of reality and the constraints it imposes on how we
sort things. Ad lib categorization is not really categorization AT ALL,
but just subjective similarity judgments based on the default similarity
structure of the set of inputs, as dictated either by our sensory
systems, our prior (imposed) categories, or both.]
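
[The contrast between imposed categorization and similarity judgment can
be made concrete with a toy sketch. It is only an illustration under
loud assumptions: the feature vectors, the labels, the learning rule
(a simple perceptron) and all the function names are hypothetical, and
it is not the symbol grounding model referred to above. It shows an
all-or-none sorting rule that changes only when feedback signals a
MIScategorization, next to a similarity measure that is graded and gets
no feedback at all.]

```python
# Hypothetical toy data: (feature vector, correct category).
# 1 = "edible mushroom", 0 = "poisonous toadstool"; values are invented.
SAMPLES = [
    ((1.0, 0.2), 1),
    ((0.9, 0.4), 1),
    ((0.2, 0.9), 0),
    ((0.1, 0.7), 0),
]


def categorize(weights, bias, features):
    """All-or-none decision: returns 1 or 0, never a degree."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0


def similarity(a, b):
    """Graded judgment: a continuous value in (0, 1], 1.0 for identical inputs."""
    distance = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return 1.0 / (1.0 + distance)


def train_imposed(samples, epochs=20, rate=0.5):
    """Imposed categorization: the rule is revised only after an error."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for features, correct in samples:
            # Feedback: +1 or -1 on a miscategorization, 0 when correct.
            error = correct - categorize(weights, bias, features)
            if error != 0:
                errors += 1
                weights = [w + rate * error * x
                           for w, x in zip(weights, features)]
                bias += rate * error
        if errors == 0:  # no miscategorizations left, so nothing to revise
            break
    return weights, bias


if __name__ == "__main__":
    weights, bias = train_imposed(SAMPLES)
    for features, correct in SAMPLES:
        print(features, "sorted as", categorize(weights, bias, features),
              "correct:", correct)
    # Ad lib "sorting" has only this graded measure to go on:
    print("similarity:", similarity(SAMPLES[0][0], SAMPLES[2][0]))
```

[Note that similarity() never returns a verdict, only a degree, whereas
categorize() never returns a degree, only a verdict; and only the latter
is ever corrected by the consequences of getting it wrong.]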
-- 
Stevan Harnad INTERNET:  harnad@confidence.princeton.edu    harnad@princeton.edu
srh@flash.bellcore.com    harnad@elbereth.rutgers.edu      harnad@princeton.uucp
BITNET:   harnad@pucc.bitnet           CSNET:  harnad%princeton.edu@relay.cs.net
(609)-921-7771