Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!rutgers!mcnc!ece-csc!ncrcae!ncr-sd!crash!gryphon!tsmith
From: tsmith@gryphon.CTS.COM (Tim Smith)
Newsgroups: sci.lang,comp.ai,comp.ai.neural-nets
Subject: Re: Infinite alphabets - (Turing via Berke)
Message-ID: <1923@gryphon.CTS.COM>
Date: Fri, 16-Oct-87 04:17:57 EDT
Article-I.D.: gryphon.1923
Posted: Fri Oct 16 04:17:57 1987
Date-Received: Sat, 17-Oct-87 19:04:28 EDT
References: <154@Aragorn.UUCP> <114400001@exunido.UUCP>
Reply-To: tsmith@gryphon.CTS.COM (Tim Smith)
Organization: Trailing Edge Technology, Redondo Beach, CA
Lines: 77
Xref: mnetor sci.lang:1569 comp.ai:904 comp.ai.neural-nets:14

In article <971@kodak.UUCP> bayers@kodak.UUCP (mitch bayersdorfer) writes:
+=====
| I don't mean to make a reductio ad absurdum, but can any alphabet which
| a given human can perceive be truly infinite?  Given that human beings have
| a finite (but very large) number of neurons in their visual and cerebral
| cortex, and that any distinct alphabetic character would exceed the
| thresholds of an enumerable permutation of those neurons, mustn't there
| be only a finite number of characters (and concepts?).  I am assuming that
| neural thresholds for a certain learned concept are relatively constant.  If
| another concept produces the same permutation, but only to a greater
| degree, then couldn't that be reduced down to two concepts-- one a measure
| of the concept, and the other a measure of its degree?
| - Mitch Bayersdorfer
+=====

You are addressing a problem that linguists have had to wrestle
with for a long time. Back in the 1950's, the linguist Noam Chomsky
did some original research on the properties of formal grammars
that might be used to generate sentences in a natural language.
Among other things, he discovered that any reasonable grammar (or
production system) to generate sentences in natural language would
have recursive properties that would, in effect, make the set of
sentences of the language infinite in size. The corollary to this,
of course, is that there is no "longest sentence" in a natural
language.

Realizing that it is somewhat silly to claim that a human can
utter or comprehend a sentence that might be a few giga-centuries
in duration, Chomsky fudged around and invented a distinction
between linguistic "competence" and linguistic "performance".

Actually, it's a reasonable fudge. A good analogy for this is
simple arithmetic. If you know the rules of, say, multiplication,
there is no reason why you can't multiply two numbers that
require a few giga-light-years worth of paper to write down (in
10-point type). You are theoretically competent to do this. Your
performance abilities (life span, attention span, etc.) will,
however, undoubtedly keep you from achieving the end product.

Cantor discovered several kinds of infinities. Perhaps what we
need to discover is a kind of sub-infinity. The actual number of
sentences in any natural language is "infinite" in the same sense
that the number of grains of sand on a beach is infinite, i.e.
"not really".

Are alphabets like sentences? Uh, no, not at all! I have stayed
out of this discussion about "infinite alphabets", since I do not
understand the issue. I enter reluctantly. Here goes...

An alphabet is a set (decidedly finite) of glyphs (e.g., little
marks on paper). A language adopts an alphabet, and tries to map
its sound system (its phonology) onto the alphabet. Sometimes
this works well, sometimes not so well. For example, both Italian
and English use the Latin alphabet. Neither language's phonology
maps directly to the alphabet, but Italian maps better than English.
In Italian there are only a few little problems (the letters "c",
"g", "e", and "o" do not map one-to-one to Italian phonemes). In
English, there are many, many problems (which we are all aware of).

The sound system of every language contains a finite number of
phonemes, and a finite (but much larger) number of syllables.
Therefore, any alphabet (phoneme-glyph mapping), or any syllabary
(syllable-glyph mapping) should be finite.

Ideographic writing systems, such as Japanese "kanji", are not
alphabets, and they are not syllabaries. If we assume that the
languages that use ideographic writing systems allow free
creation of new glyphs, then these writing systems have an
infinite number of glyphs, in exactly the same sense that there
is an infinite number of sentences in English, or that there is
an infinite number of art works that can be created by mankind.

-- 
Tim Smith
INTERNET:     tsmith@gryphon.CTS.COM
UUCP:         {hplabs!hp-sdd, sdcsvax, ihnp4, ....}!crash!gryphon!tsmith
UUCP:         {philabs, trwrb}!cadovax!gryphon!tsmith