Path: utzoo!mnetor!tmsoft!torsqnt!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!wuarchive!usc!apple!ames!haven!mimsy!mojo!buzzard
From: buzzard@eng.umd.edu (Sean Barrett)
Newsgroups: comp.sys.amiga.advocacy
Subject: Re: Announcement--new "Unicode" standard
Keywords: the kissable breasts of Yong-Mi (t.m. xanthian lust)
Message-ID: <1991Feb25.154148.29542@eng.umd.edu>
Date: 25 Feb 91 15:41:48 GMT
References: <1991Feb24.180148.21954@ux1.cso.uiuc.edu> <1991Feb25.044130.8588@zorch.SF-Bay.ORG> <1991Feb25.080721.1628@zorch.SF-Bay.ORG>
Sender: news@eng.umd.edu (C-News)
Organization: Fraternity of Avian Deists
Lines: 28

I think we have gotten far off from the Amiga here.  *sigh*

So xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) says:
>
> Classic_-_Concepts@cup.portal.com writes a description of Unicode
> thad@public.BTR.COM (Thaddeus P. Floryan) points out it takes 16 bits
>
>Actually, 16 bits is insufficient; a comprehensive Chinese dictionary of
>ideographs contains over 100,000 individual ones, and that is just one
>language's alphabet.

If the goal is portably *transmitting* characters, you can use 16 or
32 bits freely, making sure that the bit patterns you use don't
have any "naughty" 8/7-bit ASCII values in them.  If you don't need
to be able to intermix or auto-detect what language something is in,
this is sufficient.

> Ethan (deleted the ref, oops) is sure the people making the standards
> aren't dumb, but that they'll get no support for 16 bit.
>
>Actually, the ASCII standard has extension mechanisms for ever larger
>byte sizes; the use of 7 bit ASCII is tremendously parochial of the US.

A trivial extension to ASCII (isn't this *obvious*?!!) if you need
32768 new characters is to throw out parity, leave ASCII 0-127 alone,
and on ASCII 128-255 fetch the next character and combine them,
giving 15 bits of new characters (or you could fetch 3 and have
31 bits, or you could have 128-191 fetch one character for 14 bits,
192-255 fetch 3 characters more for 30 bits).
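
For concreteness, here is a rough Python sketch of the simplest variant described above: bytes 0-127 stand for themselves, and a byte in 128-255 is combined with the byte after it to give a 15-bit index. The names encode and decode, and the choice to number the extended characters 128 through 32895, are assumptions made for illustration only; nothing in the thread specifies them.

```python
ASCII_LIMIT = 128           # character numbers 0-127 are plain ASCII, one byte each
EXTENDED_COUNT = 1 << 15    # 32768 extended characters, two bytes each


def encode(codes):
    """Encode a sequence of character numbers into bytes.

    Assumption for this sketch: extended characters are numbered
    128 .. 128 + 32767, so their 15-bit index is (code - 128).
    """
    out = bytearray()
    for code in codes:
        if 0 <= code < ASCII_LIMIT:
            out.append(code)                    # ASCII passes through unchanged
        elif ASCII_LIMIT <= code < ASCII_LIMIT + EXTENDED_COUNT:
            index = code - ASCII_LIMIT          # 15-bit extended index
            out.append(0x80 | (index >> 8))     # lead byte: high bit set + top 7 bits
            out.append(index & 0xFF)            # second byte: low 8 bits
        else:
            raise ValueError(f"character number out of range: {code}")
    return bytes(out)


def decode(data):
    """Decode bytes produced by encode() back into character numbers."""
    codes = []
    i = 0
    while i < len(data):
        byte = data[i]
        i += 1
        if byte < ASCII_LIMIT:
            codes.append(byte)                  # plain ASCII
        else:
            if i >= len(data):
                raise ValueError("truncated two-byte sequence")
            index = ((byte & 0x7F) << 8) | data[i]   # 7 bits + 8 bits = 15 bits
            i += 1
            codes.append(ASCII_LIMIT + index)
    return codes
```

With this numbering, encode([65, 128]) gives the three bytes 0x41 0x80 0x00. Note that the second byte of a pair can take any value, including ones in the ASCII range, so a reader that starts in the middle of a stream cannot tell a plain ASCII byte from the tail of a pair.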