Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84 (Fortune 01.1b1); site graffiti.UUCP
Path: utzoo!linus!decvax!ucbvax!ucdavis!lll-crg!seismo!ut-sally!ut-ngp!shell!graffiti!peter
From: peter@graffiti.UUCP (Peter da Silva)
Newsgroups: net.internat
Subject: Re: character sets
Message-ID: <309@graffiti.UUCP>
Date: Tue, 15-Oct-85 20:13:26 EDT
Article-I.D.: graffiti.309
Posted: Tue Oct 15 20:13:26 1985
Date-Received: Fri, 18-Oct-85 20:00:47 EDT
References: <719@inset.UUCP> <214@rtp47.UUCP>
Organization: The Power Elite, Houston, TX
Lines: 23

7000 Japanese characters... hmmm...

How about setting the 8th bit to indicate that this byte and the following
one encode one of 32767 extended characters? (1xxxxxxx 00000000 is illegal
here.)

In an ASCII text file or stream:

    Normal ASCII character:    0xxxxxxx
    Foreign character:         1xxxxxxx xxxxxxxx

In memory or foreign file:

    Normal ASCII character:    00000000 xxxxxxxx
    Foreign character:         1xxxxxxx xxxxxxxx
    Null:                      00000000 00000000
    Two ASCII characters:      0xxxxxxx 0xxxxxxx

So an ASCII text file is a compressed form of the foreign file. The ASCII
character pair should probably be used with caution. Maybe it should just be
undefined.

If this suggestion has already been made (and it probably has, it seems a
pretty obvious way of doing things to me...) just pretend I'm not here.
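
For anyone who wants to poke at the idea, here is a rough sketch of the two forms in Python. It is only one reading of the scheme, not a definitive implementation. The function names expand and compress are made up for illustration. It assumes that "illegal" means any extended pair whose second byte is zero, and it treats the packed two-ASCII-character form as undefined, as suggested above.

```python
def expand(stream):
    """Expand the byte-stream form into 16-bit units (the in-memory form)."""
    units = []
    i = 0
    while i < len(stream):
        b = stream[i]
        if b < 0x80:
            # Normal ASCII: 0xxxxxxx becomes 00000000 0xxxxxxx.
            units.append(b)
            i += 1
        else:
            # 8th bit set: this byte and the next form one extended character.
            if i + 1 >= len(stream):
                raise ValueError("truncated extended character")
            low = stream[i + 1]
            if low == 0:
                # Assumption: a zero second byte is the illegal pattern.
                raise ValueError("illegal extended character")
            units.append((b << 8) | low)
            i += 2
    return units


def compress(units):
    """Compress 16-bit units back into the byte-stream form."""
    out = bytearray()
    for u in units:
        if 0 <= u < 0x80:
            # Plain ASCII (including null) shrinks to a single byte.
            out.append(u)
        elif 0x8000 <= u <= 0xFFFF and (u & 0xFF) != 0:
            # Extended character: emit both bytes, high byte first.
            out.append(u >> 8)
            out.append(u & 0xFF)
        else:
            # Everything else, including the packed ASCII pair,
            # is left undefined in this sketch.
            raise ValueError("undefined unit: %r" % (u,))
    return bytes(out)
```

Feeding expand the bytes for "Hi", then the pair 0x81 0x23, then "!" should give four 16-bit units, and compress should turn those back into the original five bytes, which is the "ASCII file is a compressed form" property in action.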