Xref: utzoo comp.ai:3074 talk.philosophy.misc:1822 sci.lang:3919
Path: utzoo!utgpu!attcan!uunet!lll-winken!ames!xanth!ukma!husc6!endor!reiter
From: reiter@endor.harvard.edu (Ehud Reiter)
Newsgroups: comp.ai,talk.philosophy.misc,sci.lang
Subject: Re: Categorization
Summary: Explanation of biological classifications
Message-ID: <976@husc6.harvard.edu>
Date: 13 Jan 89 02:21:35 GMT
References: <681@cogsci.ucsd.EDU> <2959@uhccux.uhcc.hawaii.edu> <179@calmasd.GE.COM> <Jan.12.11.00.32.1989.25536@elbereth.rutgers.edu>
Sender: news@husc6.harvard.edu
Reply-To: reiter@harvard.UUCP (Ehud Reiter)
Organization: Aiken Computation Lab Harvard, Cambridge, MA
Lines: 131

This is a long posting that discusses biological classifications.  Be warned.

Steve Harnad writes:
>You were taught that a penguin is a bird for reasons that you must
>study zoology, morphology and evolution to understand fully. Perhaps
>at the fuzzy frontiers of taxonomy, as elsewhere in science, there is
>still uncertainty. It still makes no difference what your layman's
>inclinations are when it comes to sorting birds. In a prescientific
>culture, this, like sorting the "elements," may have been an arbitrary
>cultural matter, but not once one is better informed. Nor is a bird
>an artificial category (like walking-stick): It is a natural kind,
>whose nature is to be discovered, not stipulated. And that nature
>(insofar as it has palpable or measurable consequences) will always
>constrain our categorizations.

Biological categories are a lot more arbitrary than many people seem to
realize (this is one of the topics Lakoff discusses in his book).  For
anyone who is really interested in the topic, I highly recommend reading
"Theories of Biological Classification and Their History", chapter 4 in
Ernst Mayr, PRINCIPLES OF SYSTEMATIC ZOOLOGY, McGraw-Hill, 1969.  Very
roughly, the situation is as follows:

The only things a biologist can directly observe or derive from direct
observation are the composition of species and the phylogenetic tree
(evolutionary history of species).  A species is defined to be an
interbreeding group of individuals.  Given this definition, it is
possible, at least in principle, to experimentally determine whether or
not a group of animals belongs to the same species.  This CANNOT be done
for higher classifications (families, genera, classes, etc.) - there is
no direct experimental test that determines whether or not a group of
individuals belongs to the same genus, class, or whatever.  At least
in this sense, higher classifications (e.g. "bird", "penguin",
"mammal") are human inventions, not distinctions that are mandated
by nature.  To quote Mayr, p. 98:

	"...the major function of a classification [is] to be useful.
	 A classification is a communication system, and the best one is
	 that which combines greatest information content with greatest
	 ease of information retrieval."

Note that species have physiological properties that are shared by all
members (e.g. all humans have two eyes), and also physiological
properties which vary in the population (e.g. hair color in humans).

The other thing that can be experimentally determined, at least in principle,
is the phylogenetic tree, that is, the history of which species descended
from which other species.  The phylogenetic tree can be determined, for
example, by examining differences in the DNA of different species.

So, species and the phylogenetic tree are the only biological observables.
Biologists have been debating for centuries how this data can be used to
form higher-order categories like "bird" or "mammal".  There are three
main approaches:

    Phenetic: ignore the phylogenetic tree, and form higher-order categories
by clustering species along their common physiological properties.  So, for
example, we might decide that one higher-order category is "all animals with
feathers", and call this category "birds".  Obviously, doing the clustering
requires deciding which physiological properties are important, and we can
well imagine a different culture deciding that "having feathers" was a less
important property than "being able to fly", and thus using the alternate
category "all animals that can fly".
    By necessity, all taxonomies created before the theory of evolution
(including Linnaeus's classification and tribal classifications) are based
on phenetic principles.
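
For the programmers reading, here is a minimal Python sketch of the phenetic
idea.  The species list and the trait table are made up for illustration
(they are my assumptions, not real taxonomic data), and the function name
phenetic_category is hypothetical.  The point is only that the category you
get depends on which property you decide is important.

```python
# Toy trait table: an illustrative assumption, not a real character matrix.
SPECIES_TRAITS = {
    "sparrow": {"has_feathers": True, "can_fly": True},
    "penguin": {"has_feathers": True, "can_fly": False},
    "bat": {"has_feathers": False, "can_fly": True},
    "mouse": {"has_feathers": False, "can_fly": False},
}


def phenetic_category(traits, key_property):
    """Return the species that have the property chosen as 'important'."""
    return {name for name, t in traits.items() if t[key_property]}


# Two cultures, two choices of key property, two different categories.
by_feathers = phenetic_category(SPECIES_TRAITS, "has_feathers")
by_flight = phenetic_category(SPECIES_TRAITS, "can_fly")

print(sorted(by_feathers))  # ['penguin', 'sparrow']
print(sorted(by_flight))    # ['bat', 'sparrow']
```

Nothing in the data says which of the two groupings is the right one; that
choice is made by whoever picks key_property.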

   Cladistic: ignore physiological properties, and create a higher-order
category for each fork in the phylogenetic tree.  In computer-science
terms, each subtree of the complete phylogenetic tree would form a
higher-order category.  The advantage of cladism is that it is in some
sense completely objective, since the cladistic taxonomy is completely
determined by the phylogenetic tree.  The problem with cladism is that
it leads to a lot of categories which may not be very useful, e.g.
"crocodiles and birds", "mammals and turtles", "primates and rodents".

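The "each subtree is a category" idea can be sketched in a few lines of
Python.  The tree below is a toy of my own (five species, nested tuples for
forks), not a real phylogeny, and the function names are hypothetical.  Given
the tree, the list of cladistic categories falls out mechanically.

```python
# Toy phylogenetic tree: a string is a species, a tuple is a fork.
# The shape is an illustrative assumption, not a real phylogeny.
TREE = (("mouse", "bat"), ("crocodile", ("sparrow", "penguin")))


def leaves(node):
    """Return the set of species at or below this node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clades(node):
    """Return one category (a set of species) per fork in the tree."""
    if isinstance(node, str):
        return []  # a single species is not a fork
    result = [leaves(node)]
    for child in node:
        result.extend(clades(child))
    return result


ALL_CLADES = clades(TREE)
for clade in ALL_CLADES:
    print(sorted(clade))
```

With this toy tree, one of the four categories produced is "crocodile,
penguin, sparrow", while "bat and sparrow" (the fliers) never appears, since
no single fork contains exactly those two.
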
   Evolutionary: use both physiological properties and the phylogenetic
tree.  Very roughly, the evolutionary taxonomist picks out certain
phylogenetic subtrees (or possibly pruned subtrees, i.e. subtrees with
some branches removed) that seem to correspond to species with shared
physiological properties, and makes these (pruned) subtrees into his
higher-order categories.  For example, he might note that all descendants
of the genus Archaeopteryx share some physiological properties, and
decide to call Archaeopteryx and all its descendants the class of "birds".
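
A pruned subtree is easy to show in code.  The sketch below is a rough
illustration under my own assumptions: the tree, its node labels, and the
function leaves_under are all hypothetical, and the traditional "reptiles"
grouping (a subtree with the bird branch removed) is used only as a familiar
example of pruning.

```python
# Toy labelled tree: each fork maps to its children; anything that is not
# a key is a species.  Labels and shape are illustrative assumptions.
CHILDREN = {
    "root": ["mammals", "sauropsids"],
    "mammals": ["mouse", "bat"],
    "sauropsids": ["lizard", "archosaurs"],
    "archosaurs": ["crocodile", "birds"],
    "birds": ["sparrow", "penguin"],
}


def leaves_under(node, pruned=()):
    """Return the species below node, skipping any branch listed in pruned."""
    if node in pruned:
        return set()
    kids = CHILDREN.get(node)
    if kids is None:
        return {node}  # a species, not a fork
    result = set()
    for kid in kids:
        result |= leaves_under(kid, pruned)
    return result


# A full subtree: the taxonomist keeps every branch.
birds = leaves_under("birds")

# A pruned subtree: same starting fork as a clade, minus the bird branch.
reptiles = leaves_under("sauropsids", pruned={"birds"})

print(sorted(birds))     # ['penguin', 'sparrow']
print(sorted(reptiles))  # ['crocodile', 'lizard']
```

Which fork to start from and which branches to prune are inputs to the
function, not outputs; that is where the taxonomist's judgement comes in.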


This ends my 2-bit review of biological classification.  I hope the reader
takes away two points in particular:

	- higher-order classifications like "bird" are human creations,
not directly observable distinctions that are mandated by nature.

	- biologists have differing views on what the "correct" higher-order
classifications are, and how they should be defined.  Besides the
phenetic/cladistic/evolutionary split, note that both phenetic
and evolutionary classifications require the taxonomist to exercise
a good deal of judgement in deciding what the higher-order categories
should be.


Steve Harnad also writes:
>I think you (and most of us) could reliably identify a penguin without
>the help of a sign (though the sign might have been helpful feedback
>when we first learned to identify them). You may not have a
>DEFINITION, but you can certainly pick 'em out; it's hence reasonable to
>conclude that something in your head is successfully detecting them
>on the basis of reliable features in the sensory representation that
>you cannot verbalize (or have not yet learned to).

Perhaps I could identify a penguin from sight alone (I'm not sure), but
I doubt I could identify, say, a platypus from sight alone.  Yet, that
doesn't stop me from knowing things about platypuses (e.g. they lay
eggs and live in Australia).  As a human being, I'm not restricted to
making categorization decisions from sense data alone - I have the
capability to use language, and to know that object X is a platypus
because I was explicitly told that object X is a platypus.


Walt Peterson writes:
>Actually, the categorization of penguins and all other birds is quite
>easy.  Unlike most other taxonomic categories, biologist are in
>agreement as to what creatures are members of the Class Aves.  All
>animals that have feathers are birds and all birds have feathers.
>This is not just a "cultural bias" nor is it an arbitrary rule.

This is true.  One definition of "bird" is all animals that have feathers
(another is all descendants of Archaeopteryx).  But the fact that the
category is well-defined does not mean it is not arbitrary.

					Ehud Reiter
					reiter@harvard	(ARPA,BITNET,UUCP)
					reiter@harvard.harvard.EDU  (new ARPA)