Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!cs.utexas.edu!samsung!munnari.oz.au!goanna!ok
From: ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe)
Newsgroups: comp.lang.c
Subject: Internationalisation (was: NULL as a string terminator)
Message-ID: <3603@goanna.cs.rmit.oz.au>
Date: 22 Aug 90 08:02:12 GMT
References: <24141@megaron.cs.arizona.edu> <134@blekko.UUCP> <1881@jura.tcom.stc.co.uk>
Organization: Comp Sci, RMIT, Melbourne, Australia
Lines: 72

In article <1881@jura.tcom.stc.co.uk>, rmj@tcom.stc.co.uk (Rhodri James) writes:
> In article <3585@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
> }For why? Internationalisation, _that's_ for why.

> I cringe when I see this (unwords like "internationalisation", I mean).

One uses language for the purpose of communication. In order to effect
that purpose, one uses words that other people know and use, not the
words one happens to like. Like it or not, "internationalise" and its
derivatives are *words* in 1990s computing jargon. Perhaps Rhodri James
may have a better term that is a miracle of euphony and clarity; well
for heaven's sake tell us what it is *now* and let's get pushing it, for
"internationalisation" bolted from its stable long ago.

(By the way, there is no such word as "unword". If there were such a
term, it would be "nonword". "dictcheck -pedantic")

> Also I fail to see your point. Surely such #ifdef switching
> as above is more efficient, simpler to maintain and more legible than
> the scrabbling about with resource files you prefer?

So now Cn James reads minds and knows what I prefer. Wonderful just.

No, it is *not* simpler to maintain. The point of the resource file
approach (not my invention by any means; no-hopers like IBM, DEC, HP,
X/Open, AT&T, Apple, ...
have been using it for a while and I just copied the idea and simplified
it a bit for this newsgroup) is that you have all the text in one place;
you don't have to go "scrabbling about" in the source files to find all
the strings. You can give the resource file to a human translator who
knows nothing about the programming language you are using. A minor
addition to such a tool (have it generate
	INTEGER MSGNO
	PARAMETER (MSGNO=......
instead of #defines) will let you use the *same* message file with a
Fortran program. Speaking as a no-hoper, I must admit that using a
technique that adapts to *all* the programming languages I use, not just
C, sounds like a saving. But what do I know?

As for efficiency, the point is that we are talking about a scheme for
generating messages for display to humans. The cost of fishing the text
out of a file is (or was every time I measured it) considerably less
than the cost of displaying it on the terminal.

The real schemes (such as the X/Open one) identify messages by numbers,
not by address in the text file. That has the disadvantage that finding
the right text is a wee bit more complex (but not very; one need merely
attach a directory at the end of the file), but it has the great
advantage that the program does not need to be recompiled. This means
that one customer can be running the program with messages coming from
the "English-speaking idiot" message file and another with messages
coming from the "Spanish-speaking wizard" message file, and both can be
sharing the same copy of the program without any recompilation at all.
That's the way it *is* in UNIX System V Release 4. We might as well get
used to thinking about messages in that way now.

> Demonstrate to me a negative impact on internationalisation (ugh) and I
> might believe you. Any negative impact will do, I'm not too choosy.
The schemes actually used by IBM (MVS, CMS, AIX), HP (HP-UX), DEC (VMS,
Ultrix), AT&T (SVR4) and others essentially add another couple of layers
of indirection above what I presented. Those systems all allow you to
switch languages at run time, without any recompilation. Those systems
all allow you to translate message files without having any other access
to the sources. They all allow many programs, and many programming
languages, to share the same message files. They all allow a customer
to substitute his own translation of a message file (perhaps amplifying
some messages, or getting the grammar right, or ...) without access to
the sources.

There's four negative impacts of the #ifdef approach, just for starters.
--
The taxonomy of Pleistocene equids is in a state of confusion.
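
For concreteness, here is a minimal sketch of fetching messages by
number through the X/Open catopen()/catgets() interface, which is one
of the "real schemes" referred to in the article. The message numbers
and the helper names (open_messages, message_text, close_messages) are
made up for illustration and come from no particular program; only the
calls declared in <nl_types.h> are the real interface.

```c
#include <assert.h>
#include <nl_types.h>
#include <stddef.h>

/* Hypothetical message numbers.  In practice a tool would generate
   these from the message file rather than having them typed by hand. */
enum { MSG_SET = 1, MSG_NO_SUCH_FILE = 1, MSG_DISK_FULL = 2 };

/* Open the catalogue called `name`.  catopen() locates the file using
   NLSPATH and the locale settings, so the language is picked at run
   time.  It returns (nl_catd)-1 if no catalogue can be opened. */
nl_catd open_messages(const char *name)
{
    return catopen(name, NL_CAT_LOCALE);
}

/* Return the text of message `msg_no`, or `fallback` if the catalogue
   is unavailable or does not contain that message. */
const char *message_text(nl_catd cat, int msg_no, const char *fallback)
{
    assert(fallback != NULL);
    if (cat == (nl_catd)-1)
        return fallback;
    return catgets(cat, MSG_SET, msg_no, fallback);
}

/* Close a catalogue, tolerating one that never opened. */
void close_messages(nl_catd cat)
{
    if (cat != (nl_catd)-1)
        catclose(cat);
}
```

A caller would open the catalogue once, for example with
open_messages("myprog") (a hypothetical catalogue name), and then print
message_text(cat, MSG_DISK_FULL, "disk full") wherever the text is
needed. Which catalogue file actually gets read depends on the
environment at run time, so the same binary serves every language
without recompilation; the English fallback string only appears when no
catalogue is found.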