Xref: utzoo comp.unix.programmer:1540 comp.lang.perl:4856 comp.std.internat:830
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!cs.utexas.edu!sun-barr!newstop!sun!amdcad!dgcad!dg-rtp!chutney!eliot
From: eliot@chutney.rtp.dg.com (Topher Eliot)
Newsgroups: comp.unix.programmer,comp.lang.perl,comp.std.internat
Subject: Re: Tools for manipulating message catalogs
Message-ID: <1991Apr10.122642.3991@dg-rtp.dg.com>
Date: 10 Apr 91 12:26:42 GMT
References: <1991Apr7.190119.24825@motcad.portal.com> <1991Apr8.191035.13836@alphalpha.com>
Sender: usenet@dg-rtp.dg.com (Usenet Administration)
Reply-To: eliot@dg-rtp.dg.com
Organization: Data General Corporation, Research Triangle Park, NC
Lines: 72

On more than one occasion I have seen references to tools that accept as
input a message catalog that contains symbolic names (rather than numbers) 
for the messages, and produce as output new message catalogs containing
numbers (or perhaps compiled message catalogs) and .h files containing the
appropriate #define lines mapping the symbolic names to the numbers. I am
here to argue that these are a Bad Thing.

At first glance, they seem great.  Who wants to keep track of a bunch of small
integers, when they can use symbolic identifiers instead?  I mean, we figured
that out back when the first assembler was written.

The problem is the context in which such a tool is being used.  One of the 
main points of symbolic names (especially #defined "constants") is that they
allow one to change the numeric value of the "constant" without having to
edit all the source files.  Thus, for example, one could add a new message
to the middle of a message catalog, rebuild everything, and it would all be
in sync.  Or so it would appear.

But what's the point of message catalogs?  The point is that you don't just
have one, you have lots, in all different languages.  Creating new versions
of those translated catalogs is NOT just a matter of rebuilding.  They have
to be sent off to translators, and then reincorporated into the product
distribution after being translated.  They may arrive at different times
(getting something translated into French can probably be done locally;
Serbo-Croatian is more of a challenge).  Depending on how you distribute them
to customers, they may or may not arrive in sync with the new executable
code.  Customers may or may not load all the new message catalogs.  On and on.
All in all, keeping message catalogs synchronized with programs that use
them is a real bitch.
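
To make the failure concrete, here is a minimal sketch in C.  Everything in
it is made up for illustration: the message texts, the V1_/V2_ names a
generator tool might emit, and the lookup() helper, which stands in for a
real catalog call.  It shows a program rebuilt after a mid-catalog insertion
running against a French catalog that was translated before the insertion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical French catalog, translated BEFORE the new message was added.
   Message numbers are 1-based, so slot 1 is french_v1[0]. */
static const char *const french_v1[] = {
    "Fichier introuvable",   /* 1: file not found */
    "Disque plein"           /* 2: disk full      */
};

/* Numbers a generator tool might emit before the insertion (hypothetical). */
enum { V1_MSG_NOT_FOUND = 1, V1_MSG_DISK_FULL = 2 };

/* Numbers it might emit after a message is inserted in the middle:
   everything after the insertion point silently shifts by one. */
enum { V2_MSG_NOT_FOUND = 1, V2_MSG_NO_PERMISSION = 2, V2_MSG_DISK_FULL = 3 };

/* Stand-in for a catalog lookup: returns the fallback text when the
   number is out of range, otherwise whatever text occupies the slot. */
static const char *lookup(const char *const catalog[], size_t count,
                          int number, const char *fallback)
{
    assert(catalog != NULL && fallback != NULL);
    if (number < 1 || (size_t)number > count)
        return fallback;
    return catalog[number - 1];
}
```

With the stale catalog, lookup(french_v1, 2, V2_MSG_NO_PERMISSION, ...)
returns "Disque plein", the wrong message, with no error of any kind.
lookup(french_v1, 2, V2_MSG_DISK_FULL, ...) falls off the end and returns
the fallback, so the French user gets English.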

The moral of this is that ONE SHOULDN'T DO THINGS THAT REQUIRE MAINTAINING
SYNCHRONIZATION BETWEEN THE APPLICATION AND THE MESSAGE CATALOG, like
inserting a new message into the middle of an existing message catalog.
One should only add new messages to the end of a catalog.  If a message is
no longer required, its place should be filled with a zero-length message
(or just not be used, depending on whether you are using AT&T or X/Open message
facilities).  That slot (message number) should not be re-used for a different
message.
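
If you want a tool at all, a checker for these rules seems more useful than
a renumbering one.  What follows is only a sketch under stated assumptions:
the function name is hypothetical, a catalog is modeled as an array of
strings with message 1 in element 0, and a retired message is represented by
an empty string.  It cannot tell a reworded message from a reused slot, so it
flags any changed text for a human to look at.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compare the previous release's catalog with the new one.
   Returns 0 if every old slot is unchanged or has been blanked out,
   otherwise the 1-based number of the first slot that breaks the rule. */
static size_t first_renumbered_slot(const char *const old_msgs[], size_t old_count,
                                    const char *const new_msgs[], size_t new_count)
{
    assert(old_msgs != NULL && new_msgs != NULL);
    for (size_t i = 0; i < old_count; i++) {
        if (i >= new_count)
            return i + 1;        /* an old slot vanished from the end */
        if (new_msgs[i][0] == '\0')
            continue;            /* retired: zero-length message, slot kept */
        if (strcmp(old_msgs[i], new_msgs[i]) != 0)
            return i + 1;        /* text changed: possible insertion or reuse */
    }
    return 0;                    /* anything past old_count is a legal append */
}
```

Run against the last shipped catalog as part of the build, a check like this
catches an insertion into the middle before the translators ever see it.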

Given these guidelines, the usefulness of the tool I described above is much
less than one might initially think.  In fact, I argue that such a tool tempts
one to break the guidelines, or perhaps I should say makes it easy to break
the guidelines without realizing that one has done so.  Without the tool,
one writes the message number right into the application program, and leaves
that value there forever.  Which is exactly what one should do.  Presumably
if one types in the wrong number, this error will be discovered early on in
testing.  (You do, after all, test each possible message usage, don't you? :-)

To reiterate:  when one is writing an application, every time one creates a
new message, it should be added to the message catalog, a message number should
be created for it, that message number should be hard-coded into the
application source code, and then it should stay that way until doomsday.
You should never WANT automatic numbering of your messages.

Some people may point out that using symbolic identifiers for messages allows
a reader of the source code to figure out easily what the message is, rather
than having to flip back and forth through a message catalog.  I would counter
that the source code is supposed to have a compiled-in default message
anyway, to cover those occasions when the message catalog is for some reason
unavailable.  Given the default message, a symbolic message identifier
doesn't add much.
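
For those using the X/Open facilities, a call site along these lines is what
I have in mind.  This is a sketch, not anyone's production code: catgets()
and nl_catd come from <nl_types.h>, but the helper names, set 1, message 7,
and the message text are all invented for the example.

```c
#include <assert.h>
#include <nl_types.h>
#include <stddef.h>

/* Hypothetical wrapper: return the catalog's text for (set, number), or the
   compiled-in fallback when the catalog could not be opened or has no such
   message.  catopen() reports failure by returning (nl_catd)-1. */
static const char *lookup_message(nl_catd catalog, int set, int number,
                                  const char *fallback)
{
    assert(fallback != NULL);
    if (catalog == (nl_catd)-1)
        return fallback;
    return catgets(catalog, set, number, fallback);
}

/* The compiled-in default doubles as documentation of what message 7 is. */
static const char DISK_FULL_DEFAULT[] = "Disk full: cannot write output file";

static const char *disk_full_message(nl_catd catalog)
{
    /* Set 1, message 7: written here once and never renumbered. */
    return lookup_message(catalog, 1, 7, DISK_FULL_DEFAULT);
}
```

Someone reading disk_full_message() sees the English text right next to the
number, which is most of what a symbolic name would have told them.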

Whew.  And so early in the morning, too.

Have I made my point clear?  Would anyone care to point out flaws in my logic?
Does anyone still think that a tool to create a .h file out of a message
catalog is useful?

-- 
Topher Eliot                           Data General DG/UX Internationalization
(919) 248-6371        62 T. W. Alexander Dr., Research Triangle Park, NC 27709
eliot@dg-rtp.dg.com                           {backbone}!mcnc!rti!dg-rtp!eliot
Obviously, I speak for myself, not for DG.