Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!genrad!panda!husc6!harvard!caip!seismo!mcvax!vmucnam!imag!vandome
From: vandome@imag.UUCP (Gerard Vandome)
Newsgroups: net.internat
Subject: International Character
Message-ID: <674@imag.UUCP>
Date: Wed, 30-Apr-86 17:54:18 EDT
Article-I.D.: imag.674
Posted: Wed Apr 30 17:54:18 1986
Date-Received: Sun, 4-May-86 05:36:59 EDT
Reply-To: xopen@echbull.UUCP (Pascal Beyls)
Organization: IMAG, Un. of Grenoble, France
Lines: 43

I would like to clarify the definition of an "international character".

First, be careful with words such as: character, char, byte, integer,
letter, string ... For example, SVID 1 states in GETC(BA_LIB) that
"the function getc returns the next character (i.e., byte) ..."
although its definition is: int getc(stream).

Secondly, with ISO 646 (US ASCII) no problems arise, because each
character corresponds to exactly one byte. The fact that the result is
an integer, and not an unsigned integer (as expected), allows the test
for EOF (generally -1).

Consider the following problem in CONV(BA_LIB): int toupper(c) (with
int c), called for example in ISO 8859/1 with the character c = ll,
must return Ll in Spanish.

In an international version of UNIX, what should a "character" be?
- with ISO 8859/1 (Latin 1) code
- with CCITT (teletext) code, where a character may consist of a
  diacritical sign followed by a letter
- with JIS 6226 (Japanese) code, where a character occupies 2 bytes

QUESTIONS:
- What is the size in bytes of a character?
- Is that question a real question?
- Are double letters such as "ij" in Dutch or "'e" in teletext code
  considered as one character?
- Is an international character a signed or an unsigned character?

I will be pleased to receive your comments on this topic.

Pascal BEYLS
BULL France
EUNET : mcvax!vmucnam!echbull!xopen
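
As a footnote to the remark on getc and EOF, here is a minimal sketch in C of why the int return type matters once 8-bit codes such as ISO 8859/1 are in use. The names byte_source, source_getc and count_byte are hypothetical and come from no standard; source_getc only imitates the getc convention (a byte returned as a value 0..255, or EOF) over a buffer in memory, and 8-bit bytes are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>   /* for EOF */

/* Hypothetical in-memory stand-in for a stream (not a standard type). */
struct byte_source {
    const unsigned char *data;
    size_t len;
    size_t pos;
};

/* Imitates the getc convention: returns the next byte as a value in
 * 0..255, or EOF (a negative int) when the source is exhausted. */
int source_getc(struct byte_source *src)
{
    assert(src != NULL);
    if (src->pos >= src->len)
        return EOF;
    return src->data[src->pos++];
}

/* Counts the bytes equal to `target` that occur before EOF. */
long count_byte(struct byte_source *src, unsigned char target)
{
    long n = 0;
    int c;   /* int, not char: must hold every byte value and EOF */

    while ((c = source_getc(src)) != EOF) {
        if (c == target)
            n++;
    }
    return n;
}
```

Because the result is kept in an int, the byte 0xFF (y with diaeresis in ISO 8859/1) comes back as 255 and cannot be mistaken for EOF. Had c been declared as a signed char, the same byte would typically compare equal to an EOF of -1 and the loop would stop early. None of this settles what a "character" is when it takes more than one byte.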