Xref: utzoo comp.ai:3236 talk.philosophy.misc:1922 sci.lang:4038
Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.milw.wisc.edu!uxc!deimos!rutgers!elbereth.rutgers.edu!harnad
From: harnad@elbereth.rutgers.edu (Stevan Harnad)
Newsgroups: comp.ai,talk.philosophy.misc,sci.lang
Subject: Re: Categorization
Summary: (1) Categorization is not primarily a linguistic problem
	 (2) Children's categories are not under/overextensions of ours
Keywords: Crisp Sets and Fuzzy Sets
Message-ID: <Jan.28.21.16.52.1989.10962@elbereth.rutgers.edu>
Date: 29 Jan 89 02:16:54 GMT
References: <681@cogsci.ucsd.EDU> <2959@uhccux.uhcc.hawaii.edu> <2899@xyzzy.UUCP> <9750@bcsaic.UUCP>
Organization: Rutgers Univ., New Brunswick, N.J.
Lines: 94


rwojcik@bcsaic.UUCP (Rick Wojcik) of Boeing Computer Services AI
Center, Seattle wrote:

" Linguists have long been aware of the problems with all-or-none
" categories. Stevan... simply defines the word 'categorization' to suit
" his all-or-none criterion, without any regard for the way humans
" actually assign categories. But consider the classic examples of
" 'semantic vagueness'. We have the mental illusion that mountains and
" waves are discrete objects. Questions like 'How many mountains are
" there in the Cascades?' or 'How many waves are there in the ocean?' are
" semantically well-formed, but impossible to answer from a conceptual
" point of view. There are no natural discrete boundaries to these
" categories, such that you can always tell where one mountain or wave
" leaves off and another begins.

The problem of how we categorize (i.e., sort and label objects
and states of affairs) is basically not a linguistic one, though it of
course makes contact with linguistics at some level (because the
category labels form our lexicon, and language allows us not only to
label categories, but to describe them). For example, how we manage to
sort and label mountains and waves is basically a perceptual problem:
What are the internal representations that allow us to categorize
members and nonmembers of these categories successfully, in those many
cases in which we are able to do so? It is those who ignore (or take
for granted) this enormous core of reliable, correct, all-or-none
categorization performance who are not showing due "regard for the way
humans actually assign categories." The solution to the problem of HOW
people manage to sort and label things as they actually do will not
come from linguistics; it will come from a theory of perceptual and
cognitive representation.

Nor will how we categorize in most cases be determined from our
introspective discourse about how we categorize, any more than how we
perceive will be determined from our introspections about our
perception. The explanation will come from theoretical inference and
the building and testing of causal models for the underlying
mechanism.

I also remind the reader that the question to which these discussions
were addressed was whether or not the representation that allows us to
categorize is "classical," i.e., consists of features that are
necessary and sufficient to sort members from nonmembers, NOT whether
or not we can sort EVERY instance of ANYTHING we ever encounter in an
all-or-none fashion. The question under discussion is simply moot for
cases in which we CANNOT sort members from nonmembers (e.g., "vague"
cases). Note that this point is a logical, not an empirical one;
its only empirical aspect is the evidence (and it's all over the map --
unless you're in the grip of an introspective theory) that there do
indeed exist myriad categories that we can and do sort and label in a
reliable, correct, all-or-none fashion.

" A child might use the word 'doggie' on different occasions to refer to
" four-legged things, furry things (e.g. a blanket), things that move,
" etc. Overextensions and underextensions seem to involve a fine-tuning
" of categorization that looks more like the so-called 'classical' type.

Indeed it does -- and the process leads ultimately to our asymptotic
core of perfectly "classical" categories. My only quarrel with this
terminology has been that to call this "overextension" and
"underextension" is to adopt too omniscient or ontological a view.
According to my theory, ALL categories are provisional and approximate,
including our adult ones. Their context of interconfusable alternatives
could always in principle be widened so as to show up our former
representations as having been over- or underextended (based on
hindsight). At a given point in its experience a child's category may
accordingly NOT be over- or underextended relative to the actual sample
of alternatives he has so far encountered and the feedback he has so
far received from the consequences of miscategorization; the
over/underextension may only be relative to OUR categories and their
larger and more representative contexts. Subsequent experience may
force the child to revise his categories and eventually converge on
ours, but that does not necessarily mean they are over- or underextended
at THIS point.

Sometimes we expect too much from children and other category learners
on the basis of the data available to them; for similar reasons we
sometimes also attribute too much to them (as in the chimpanzee
"language" studies). It is only by taking account of the categorizer's
actual sorting performance in its actual context of confusable
alternatives that one can infer a category's actual extension and
intension, and hence its underlying representation. (On the other hand,
over- and underextension CAN be defined during this actual learning
phase WITHIN the child's actual local context of alternatives, while
miscategorization with feedback is going on; this, however, is probably
more perspicuously described as the formation or revision of the
child's own provisional categories, guided by the consequences of
miscategorization, rather than as the "fine-tuning" of OUR categories.)
-- 
Stevan Harnad INTERNET:  harnad@confidence.princeton.edu    harnad@princeton.edu
srh@flash.bellcore.com    harnad@elbereth.rutgers.edu      harnad@princeton.uucp
BITNET:   harnad@pucc.bitnet           CSNET:  harnad%princeton.edu@relay.cs.net
(609)-921-7771