Path: utzoo!mnetor!uunet!husc6!ut-sally!utah-cs!defun.utah.edu!shebs
From: shebs%defun.utah.edu.uucp@utah-cs.UUCP (Stanley T. Shebs)
Newsgroups: comp.lang.prolog
Subject: Re: character type (was Re: BSI standards)
Message-ID: <5341@utah-cs.UUCP>
Date: 11 Mar 88 16:40:19 GMT
References: <8803082357.AA01587@decwrl.dec.com> <5334@utah-cs.UUCP> <5337@utah-cs.UUCP> <756@cresswell.quintus.UUCP>
Sender: news@utah-cs.UUCP
Reply-To: shebs%defun.utah.edu.UUCP@utah-cs.UUCP (Stanley T. Shebs)
Organization: PASS Research Group
Lines: 32

In article <756@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:

What he says.

O'Keefe makes some strong points about the complexities of case
conversion in various languages. I suppose case conversion on
individual characters is too prevalent to drop it in favor of "word"
or "sentence" case conversion! I haven't done any significant text
processing in CL, so can't comment on the "correct" practice.

>I think the Common Lisp character abstractions aren't quite right either.
>Trying to treat control-super-hyper-X as a single character is not quite
>right. (How many versions of Common Lisp have (> (char-bits-limit 1)) ?)

The real reason for having "char-bits" in CL has more to do with a
certain Lisp company than with sound technical reasons. Thus the
"fancy" characters are not required to be storable into strings, which
limits their usefulness! Still, most commercial CL impls *do* have
(> char-bits-limit 1), but the main reason seems to be that there are
usually about 24 bits available in the standard representations, but
only 7 are actually needed for a code, so there's nothing to lose by
saying that some of the remaining bits are "char-bits". All pretty
sad, actually...

>If characters are represented by integers, then it is straightforward
>to program up missing operations. If characters are a separate data type,
>but that data type is missing many of its "natural" operations, then you
>wind up with murky code changing types all over the place.

A separate data type with conversion functions doesn't imply murky
code, if the missing operations have been written to keep all the
murkiness to themselves.

stan shebs
shebs@cs.utah.edu