Path: utzoo!mnetor!uunet!seismo!sundc!pitstop!sun!decwrl!decvax!ucbvax!CS.ROCHESTER.EDU!nl-kr-request
From: nl-kr-request@CS.ROCHESTER.EDU (NL-KR Moderator Brad Miller)
Newsgroups: comp.ai.nlang-know-rep
Subject: NL-KR Digest Volume 4 No. 3
Message-ID: <880112202345.6.MILLER@DOUGHNUT.CS.ROCHESTER.EDU>
Date: 13 Jan 88 01:23:00 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Reply-To: nl-kr@cs.rochester.edu
Organization: University of Rochester, Department of Computer Science
Lines: 462
Approved: nl-kr@cs.rochester.edu


NL-KR Digest             (1/12/88 20:19:31)            Volume 4 Number 3

Today's Topics:
        Dependency Grammar / Variable Word Order
        empirical science of language
        Natural vs. Programming Languages
        Re: Language Learning
        Re: online dictionaries
        Seeking machine readable dictionaries in French
        
Submissions: NL-KR@CS.ROCHESTER.EDU 
Requests, policy: NL-KR-REQUEST@CS.ROCHESTER.EDU
----------------------------------------------------------------------

Date: Fri, 1 Jan 88 00:29 EST
From: Michael Covington <MCOVINGT%UGA.BITNET@forsythe.stanford.edu>
Subject: Dependency Grammar / Variable Word Order

[Excerpted from PROLOG Digest]

  I would like to hear of any work, published or unpublished, on the
following topics:

(1) Parsing using a dependency rather than a constituency grammar,
i.e., by establishing grammatical relations between individual words
rather than gathering words into groups.

(2) Parsing strategies that have been used successfully with languages
that have variable word order, such as Russian or Latin.

I am familiar with GPSG, HPSG, and ID/LP grammars, and with various
proofs that (some) dependency grammars are equivalent to (some)
constituency or phrase-structure grammars. However, there must have been
some good ideas in circulation much earlier, perhaps in connection
with early machine translation work. 
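
As a concrete picture of item (1), here is a minimal Python sketch of
what establishing relations between individual words might look like
for a language with case marking. The three-word Latin lexicon, the
case-to-relation table, and the function names are all assumptions made
up for illustration; this is not a method described in the post, and a
real variable-word-order parser would have to cope with ambiguous case
forms, modifiers, and multiple clauses.

```python
from itertools import permutations

# Hypothetical toy lexicon: word -> (category, case). Illustrative only.
LEXICON = {
    "agricola": ("noun", "nom"),   # "farmer", nominative
    "puellam": ("noun", "acc"),    # "girl", accusative
    "amat": ("verb", None),        # "loves"
}

# Hypothetical mapping from case to the grammatical relation it signals.
RELATION_FOR_CASE = {"nom": "subject", "acc": "object"}


def parse_dependencies(words):
    """Return a set of (head, relation, dependent) links for a one-verb clause.

    Links are established word to word from case marking alone; the
    position of a word in the sentence is never consulted.
    """
    verbs = [w for w in words if LEXICON[w][0] == "verb"]
    if len(verbs) != 1:
        raise ValueError("this sketch handles exactly one verb per clause")
    head = verbs[0]
    links = set()
    for w in words:
        category, case = LEXICON[w]
        if category == "noun":
            links.add((head, RELATION_FOR_CASE[case], w))
    return links


def all_orders_agree(words):
    """True if every permutation of the words yields the same set of links."""
    analyses = {
        frozenset(parse_dependencies(list(order)))
        for order in permutations(words)
    }
    return len(analyses) == 1
```

Because the links come from case marking rather than position,
parse_dependencies returns the same analysis for every ordering of
"agricola", "puellam", "amat", which is what all_orders_agree checks.
No words are gathered into groups at any point.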
------------------------------

Date: Tue, 5 Jan 88 11:50 EST
From: Bruce E. Nevin <bnevin@cch.bbn.com>
Subject: empirical science of language

To Walter Rolandi <rolandi@gollum.columbia.nrc.com>

The status of linguistics as a science has been a vexed question for a
very long time.  There are a number of good reasons.  Probably the
central one is this:  in all other sciences and in mathematics, you can
rely on the shared understanding of natural language to provide a
metalanguage for your specialized notations and argumentation.  In
linguistics you cannot without begging fundamental questions that define
the field.  There is an exactly parallel difficulty in psychology:  a
psychological model must account for the investigator on the same terms
as it accounts for the object of investigation.  The carefully crafted
suspension of subjectivity that is so crucial to experimental method
becomes unattainable when subjectivity itself is the subject.  (See
Winograd's recent work, e.g. _Understanding Computers and Cognition_ for
reasons why computer modelling of natural language is not possible, on
the usual construal of what computer modelling is.  I have references to
work that gets around this "Frame Problem" if you are interested.)

For a more explicit and comprehensive critique of generative
linguistics, I recommend Maurice Gross's paper "On the Failure of
Generative Grammar" in _Language_ 55.4:859-85 (1979).  Despite its
prominent publication in the most prestigious journal in the field by an
acknowledged expert, there has been NO REJOINDER to this paper.  It is
hard to construe this rather astonishing silence other than as a tacit
confession on the part of generativists.  Gross also offers very
specific and detailed alternatives to the Generativist programme, so we
are not just talking about carping negativity here.

You wanted an empirically based natural science of language.
For more on the empirically based paradigm of linguistics as a natural
science that Gross draws upon, you should become acquainted with the
recent work of Zellig Harris:  _Mathematical Structures of Language_
(1968); _A Grammar of English on Mathematical Principles_ (1982--I
reviewed this in _Computational Linguistics_ in 1984); _The
Form of Information in Science_ (with Gottfried, Ryckman, and others,
1987); a book on the theory of language forthcoming from Oxford whose
title I forget.  Harris is at Columbia where you are, in the Center for
the [Study of the?] Social Sciences, so it shouldn't be too hard to find
out what he's about.  He gave the Bampton Lectures a year ago October at
Columbia, and his series on language and information will be published
by Columbia.

Chomsky was Harris's grad student in the late '40s and early '50s when
Harris began to develop the notion of linguistic transformations as a
way of regularizing texts for discourse analysis.  Chomsky has
attributed his breach with Harris to a conflict between his own
Rationalist philosophy and (what he considers) Harris's Empiricist
position.  

Chomsky borrowed notation from mathematics (Post's "production systems")
to represent a form of syntactic analysis that had been developed in the
1920s and '30s and formalized in 1946 in a couple of papers by Harris
and Rulon Wells.  (Harris noted the major deficiency--inability to
account for the heads of endocentric constructions in a natural way--in
his 1946 paper, and his workaround was borrowed as the X-bar notation
when the generativists realized something was wrong almost thirty years
later.  Harris developed string grammar in the late 1950s as an
alternative.  Joshi proved Harris's suggestion that you could refine the
word subclasses of a string grammar to get a transformational grammar,
and Sager & Grishman exploited this in their very successful LSP system.
Joshi's tree-adjoining grammars are a more or less direct development
from string grammar, combining very restricted rewrite rules for
exocentric constructions with adjunction rules for endocentric
constructions.)

Because of his presupposition of innate hard-wired neural correlates of
linguistic entities, Chomsky shifted attention from language to the now
familiar rewrite rules and phrase structure trees of his notation.
(Backus and Naur developed BNF notation 5 or 6 years later as a
metalanguage to account in a unified way for the proliferating variety
of computer languages.  The affinity with trends in nascent computer
science appears to have lent at least credibility to the rapid expansion
of Generativist missionaries.  Certainly, Harris supported his (in the
event ungrateful) student, for example recommending Chomsky as a
substitute for himself to give the keynote address at the 1962 (?)
International Congress of Linguists.)  A historically arbitrary choice
of notation has become reified as the form of data in linguistics.  Many
problems stem from the characteristics of PSG:  that there is no
principled way to control proliferation of nonterminal symbols; that
nonterminal symbols have no semantic value (names of nonterminals are
arbitrary, relation to words of "surface strings" is indirect); that the
rules "overgenerate" and extraneous structures must be filtered, and so
on.
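
Two of these complaints, overgeneration and the proliferation of
nonterminals, can be seen in a toy example. The Python sketch below
uses a made-up three-rule grammar (the symbols, words, and function
names are assumptions for illustration, not anything drawn from the
literature cited here): the rules generate agreement violations that
must be filtered afterwards, and the obvious way to avoid the filter is
to split every category by number, which multiplies the nonterminals.

```python
from itertools import product

# Hypothetical toy phrase structure grammar; illustrative only.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["the", "dog"], ["the", "dogs"]],
    "VP": [["barks"], ["bark"]],
}


def expand(symbol, rules):
    """Return every terminal string (a tuple of words) derivable from symbol."""
    if symbol not in rules:  # a terminal word rewrites to itself
        return [(symbol,)]
    results = []
    for rhs in rules[symbol]:
        # Expand each right-hand-side symbol, then combine the alternatives.
        parts = [expand(s, rules) for s in rhs]
        for combo in product(*parts):
            results.append(tuple(w for part in combo for w in part))
    return results


def agrees(sentence):
    """Post-hoc filter: a plural subject takes 'bark', a singular one 'barks'."""
    subject_is_plural = "dogs" in sentence
    verb_is_plural = "bark" in sentence
    return subject_is_plural == verb_is_plural


generated = expand("S", RULES)                 # includes "the dog bark"
accepted = [s for s in generated if agrees(s)]

# The alternative to filtering: split each nonterminal by number.
SPLIT_RULES = {
    "S": [["NP_sg", "VP_sg"], ["NP_pl", "VP_pl"]],
    "NP_sg": [["the", "dog"]],
    "NP_pl": [["the", "dogs"]],
    "VP_sg": [["barks"]],
    "VP_pl": [["bark"]],
}
```

Here the unsplit grammar generates four strings of which two survive
the filter, while the split grammar generates exactly those two at the
cost of five nonterminals instead of three. Each further feature
(person, case, gender) would multiply the symbols again.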

I can't go into a detailed comparison of the two paradigms here.  For
Chomsky, transformations were derivation operations on the abstract
objects generated by rewrite rules.  For Harris, transformations were
mappings in the set of sentences from which, over the next thirty years,
he was able to develop a mathematical theory of language and of
linguistic information with rules of composition and derivation.  For
Chomsky, having something in mind to say and putting it into words is a
matter for a separate theory of performance, which his theory of
competence merely constrains.  For Harris, this obfuscatory dualism is
unnecessary, a fact that alone recommends him to computational
linguists.  For Harris, it is essential that the grammatical description
add no unnecessary structure to the redundancy that language uses for
informational purposes (extrinsic redundancy), and the product of a
maximally efficient grammar is a representation of the objective
information in texts.  For Chomsky, the great complexity of his
grammatical machinery is yet another evidence that it corresponds to
something innate, else how could a child possibly learn a language?  (It
is a fascinating exercise to compare what Harris demonstrates is
sufficient for language to what Generativists claim is necessary for
language.)

According to Kuhn, a scientific revolution takes at least two
generations.  For old die-hards the new paradigm just doesn't make
sense:  they can't make the "paradigm shift" and remain unconverted; for
subsequent generations of entrants to the field, the new paradigm seems
obviously better, and they take over as the die-hards die off.  But the
Generativist revolution in linguistics has failed to become the obvious
way to talk about language, so we have to say that it was more a
political revolution than an intellectual one.

It has, however, done enormous damage to the field.  Linguistics now has
lost credibility.  Perhaps it _is_ best that computer scientists
re-invent the field under a new name.  It is unfortunate in the process
to reinvent a number of cubical and hyperbolic wheels, but there may be
benefit even there--sort of like the value of recessive genes in the
gene pool that are dysfunctional at time x and adaptive at time y.
Certainly, we can now more easily find out what American structuralists
were really about, for example, without the rhetorical distortions of
the Generativist gospel.

PS:  I just noted an article by Victor Yngve in the last issue of
Theoretical Linguistics, "Linguistics among the sciences."  I have not
yet read it.

bn@cch.bbn.com
<usual_disclaimer>

------------------------------

Date: Tue, 5 Jan 88 13:06 EST
From: Rick Wojcik <rwojcik@bcsaic.UUCP>
Subject: Natural vs. Programming Languages

>From: John Chambers <jc@minya.UUCP>
[remarks that linguistics had little to contribute to computer modelling
 of natural language]
>This has had a large shock effect on linguistics, but linguistics has 
>had little influence on computing.  For people that claim to be scientists, 
>this is pretty damning.

I'm sure that you don't mean to ignore the important contributions made
by Noam Chomsky on the formal description of artificial languages.  I
don't think that you can study compiler design without learning
something about the value of formal linguistic theory in computing.  The
real problem in the computer community is that natural language models
have ignored the important differences between natural and artificial
languages.  I take it as a criticism of Chomsky that he has sometimes
been guilty of the same sin.

>In the last few years, I've tried occasionally to interject into some
>of the discussions in this newsgroup some examples tieing computer
>languages into the discussions on human languages.  The responses have
>been quite revealing.  Time and again, people have sent me some really
>good flames about how stupid I was to think of computer languages and
>human languages together.  (After all, can you write poetry in a computer
>language? :-)

I don't think that it is stupid to think of computer languages and human
languages together.  How else can you learn what the differences are?  I
have seen some computer programs that I consider poetry.

>Not all of these flames were from novices; some were from professional 
>linguists.  Nearly all make the claim that a "natural" human language
>is somehow different from "artificial" computer languages.

They sure are.  After all these years, can't you think of any
differences?

>My main reaction to all these flames is:  Who do you think created the
>computer languages?  Computers? 

In a manner of speaking, yes.  The structure of computer language
follows very much from the nature of computers, just as the structure of
natural language follows from the nature of humans.  The evolution of
computer languages has always been in the direction of trying to make
them more like natural language.  But the similarities are quite
superficial.  Computer languages are not designed to be unambiguous, but
they are designed to be disambiguated from "local context"--i.e. by
reference to computer code alone.  Natural languages are far more
ambiguous than nonlinguists tend to imagine.  The reason for this is
that speakers perceive natural language as relatively unambiguous.  They do
so because human minds resolve ambiguity in pragmatic and
semantic contexts.  Cognitive scientists have really only begun to
understand the mechanisms that are needed to disambiguate human
language.  Recent work at Berkeley (cf. Lakoff's Women, Fire, and
Dangerous Things) has added an additional wrinkle: the role of metaphor
in natural language understanding and cognition.  Nothing like
metaphorical reasoning is to be found in computer languages.
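
To give a rough sense of how much structural ambiguity hides in
ordinary sentences, here is a small Python sketch that counts the
distinct parse trees a toy grammar assigns to a sentence. The grammar,
the lexicon, and the function name count_parses are assumptions chosen
for illustration; the counting method is the standard CKY chart
technique, and nothing here models the pragmatic and semantic context
that human listeners use to pick one reading.

```python
from collections import defaultdict

# Hypothetical toy grammar in binary form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # prepositional phrase modifies the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # prepositional phrase modifies the noun phrase
    ("PP", "P", "NP"),
]

# Hypothetical lexicon: one category per word, to keep the sketch small.
LEXICON = {
    "I": "NP", "saw": "V", "the": "Det",
    "man": "N", "telescope": "N", "park": "N",
    "with": "P", "in": "P",
}


def count_parses(sentence, start="S"):
    """Count the distinct parse trees for sentence under the toy grammar."""
    words = sentence.split()
    n = len(words)
    # chart[(i, j)][X] = number of trees rooted in X that span words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        chart[(i, i + 1)][LEXICON[word]] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # every split point inside the span
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += (
                        chart[(i, k)][left] * chart[(k, j)][right]
                    )
    return chart[(0, n)][start]


print(count_parses("I saw the man with the telescope"))              # 2
print(count_parses("I saw the man with the telescope in the park"))  # 5
```

One prepositional phrase gives two readings and a second gives five,
yet a speaker typically notices only one. A compiler faced with the
same situation would need a rule stated in the language definition to
choose among them.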

>It's also fun to taunt the linguists with a challenge:  If you really
>understand how languages work, then obviously you should be able to
>express your theory as a "model" in the form of a computer program.
>Most other scientific fields consider computer modeling to be routine 
>nowadays.  So let's see those working computer models of English,
>Japanese and Quechua!

For a linguist to claim to "understand how languages work" would be to
beg the question.  Several natural language
parsers are loosely based on linguistic theories--GPSG and GB, for
example.  Bresnan's LFG theory grows directly out of a computational
model.  The point about computer modelling is good, but it ignores the
limitations of computer modelling.  There need to be many more advances
in computer hardware and software before our models become reasonable
approximations of what the human mind can do.

===========
Rick Wojcik   rwojcik@boeing.com

------------------------------

Date: Tue, 5 Jan 88 16:32 EST
From: Rick Wojcik <rwojcik@bcsaic.UUCP>
Subject: Re: Language Learning (a Turing test)

I have come across a couple of items that should be of interest to this
news topic.  One is Rene Coppieters' Sept. '87 Language article
"Competence differences between native and non-native speakers".
Coppieters studied the intuitions of native French speakers and "Near
Native" French speakers who learned French after puberty.  She found
that the NS group had substantially different intuitions of
grammaticality than those in the NNS group.  This supports the Critical
Period Hypothesis (CPH).  

The second item relates to a paper presented at the recent LSA meeting
in San Francisco: Mark Patkowski's "Age and accent in a second language:
A critical review."  (Mark is an assistant professor of linguistics at
Brooklyn College.)  His review concludes that "the literature provides
support (albeit indirect) for the notion of an 'optimal' period for the
acquisition of phonology in a second language in particular, and for the
'critical period hypothesis' in general." After the paper, a commentator
remarked that research by Newport and others shows a critical period
effect in the acquisition of American Sign Language (ASL).  This work is
about to be published in Science and in Cognitive Psychology. If true,
such work could be even more damaging for those who minimize differences
between adult and child acquisition.  The ASL study shows a deficit in
the acquisition of syntax, not just phonology. 
-- 

===========
Rick Wojcik   rwojcik@boeing.com


------------------------------

Date: Tue, 5 Jan 88 20:38 EST
From: Rob McConeghy <malibo@arizona.edu>
Subject: Re: Language Learning (a Turing test)


In article <120@psc90.UUCP>, tos@psc90.UUCP (Dr. Thomas Schlesinger) writes:
> 
> My own children were 3 and 4 respectively when I spent three years in
> Frankfurt, Germany.  It fascinated me to watch how they had no inkling
> of their total bilinguality (yes, I know a horrible "word").  But
> when a German spoke to them they automatically answered in German, and
> when an American spoke to them they answered in English.  If I spoke
> to them in the "wrong" language, i.e. German, they'd get angry at me.
> But they didn't know why... they didn't really know what "languages"
> were and that they were "bilingual."  

Pardon our curiosity, but did your children remain bilingual after you left
Germany ?

I had cousins who at slightly older ages spent a year in Lausanne Switzerland
where they attended the local schools. When they returned they had a fairly
low level command of French - i.e. their accents were good and they could
understand simple everyday statements and questions and respond in a semi-
coherent manner. They had had no previous exposure to French and did not
live in a bilingual household; only English was spoken at home. After a
year or two back in the states they had completely forgotten all the French
they had picked up.
This possibly illustrates two aspects of childhood language acquisition.
First of all, it is not instantaneous. It takes quite a long time; even
in a high-immersion environment, a year is not sufficient.
Secondly, motivation is a factor. My cousins were not really highly motivated
to learn French. They knew they would be going home at the end of a year, and
they could still speak English at home to their parents and each other.
They also had no reason to retain or expand on their French after they
returned to the US.
Of the three cousins, only the youngest (about 5 or 6, as I recall) really
learned quite a lot of French; the 9 or 10 year old also learned a lot; the
14 or 15 year old not much at all (or at least we couldn't get any out of her,
pretty typical non-communicative teenager). This is mainly as was to be
expected except for the difference between the 5 year old and the 10 year old.

If we accept the theory that there are automatic language acquisition processes
that are at work in children's heads, we should consider whether various parts
of them shut down at various times rather than all at once at age 12 or so.
As any elementary school teacher or parent knows, children do not master all
aspects of their native language at once, nor do they master all aspects of
thinking with equal speed. Language acquisition in children probably needs
to be studied in close relation to the child's acquisition of other mental
abilities. 
Does anyone know of studies where this has been done, especially in relation
to the acquisition of multiple languages by children?


------------------------------

Date: Thu, 7 Jan 88 19:53 EST
From: Doug Rudoff <doug@wiley.UUCP>
Subject: Re: Language Learning

This is slightly off the subject but ...

When I was a junior in college, I spent a year studying electrical
engineering in Lausanne, Switzerland. Over the course of the year I
learned to speak French somewhat fluently (i.e. I had no problem
thinking in French without resorting to thinking of the English
translation). It has been about four years since then, and due to
lack of speaking French, I really have to struggle and think if I try
to speak it. Also, even when I was able to speak French fluently, I
had an atrocious accent.

On occasion I have dreams in which people speak French to me. When
they speak French, it is perfect and unaccented. Yet, when I speak
French in my dreams, I speak as well (or really as poorly) as I do
during my waking hours.

From this observation I get the impression that speaking a language
and understanding a language by listening are totally separate brain
processes. Have studies been made to this effect ?

Another observation I have is related to my ability to learn languages
by speech or writing (I've studied Spanish, German and French). I have
always found that I have an easy time learning a language in writing.
I have a very hard time picking things up by ear. It seems that it is
easier for me to convert visual information (writing) to language
concepts than it is to convert aural information (speech) to language
concepts. Other people I know are the exact reverse.

=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Doug RUDOFF        TRW Inc, Redondo Beach, CA  {cit-vax,trwrb,uunet}!wiley!doug
H: (213) 318-9218  W: (213) 812-2768               wiley!doug@csvax.caltech.edu
=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

------------------------------

Date: Wed, 6 Jan 88 14:29 EST
From: Mary Patricia Lowe <mary@csd4.milw.wisc.edu>
Subject: Re: online dictionaries

In article <29@gollum.Columbia.NCR.COM> rolandi@gollum.UUCP () writes:
>
>	the Microsoft CD ROM version of the American Heritage Dictionary
>	the OED from Oxford University Press
>
>If anyone can locate these sources, I would appreciate what they find out.

In the January 1988 issue of IEEE Spectrum, the section on Tools and Toys
(p. 73) contains a short blurb on the Microsoft Bookshelf. The CD-ROM
includes the following reference works:

	The World Almanac and Book of Facts,
	The American Heritage Dictionary,
	The U.S. ZIP Code Directory,
	The Chicago Manual of Style,
	Bartlett's Familiar Quotations,
	Roget's II: Electronic Thesaurus,
	Houghton Mifflin Spelling Verifier and Corrector,
	Houghton Mifflin Usage Alert,
	Business Information Sources.

For more information, contact: Microsoft Corp., Box 97017, Redmond, WA. 98073,
(206)-882-8088.

			-Mary

Mary Patricia Lowe	mary@csd4.milw.wisc.edu	      ...ihnp4!uwmcsd1!mary
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

------------------------------

Date: Fri, 8 Jan 88 09:51 EST
From: Craig Stanfill <craig@think.COM>
Subject: Re: online dictionaries


In article <1092@pbhyd.UUCP> rjw@pbhyd.UUCP (Rod Williams) writes:
>My understanding is that the online Oxford English Dictionary (OED)
>is still a work-in-progress and is not yet commercially available.

There is a new edition of the OED, which is currently in preparation,
and will eventually be available in electronic form.  There is also
the old (1932?) edition plus numerous supplements, which is available
in electronic form through Oxford University Press.


------------------------------

Date: Fri, 8 Jan 88 09:24 EST
From: Gilloux <rutgers!mimsy.umd.edu!uunet!mcvax!cnetlu!gilloux>
Subject: Seeking machine readable dictionaries in French


 I am in search of machine readable dictionaries in French.

 The intended use is to extract automatically semantic information
needed in a NL parser.

 Any help would be appreciated.

  
 Michel Gilloux
 Centre National d'Etudes des Telecommunications
 LAA/SLC/AIA
 Route de Tregastel, BP 40
 22301 Lannion Cedex
 FRANCE

 UUCP: mcvax!inria!cnetlu!gilloux

------------------------------

End of NL-KR Digest
*******************