Xref: utzoo comp.unix.programmer:1593 comp.lang.perl:4936 comp.std.internat:855
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!bu.edu!wang!news
From: rschwartz@OFFICE.WANG.COM (R. Schwartz@Wang R&D Net)
Newsgroups: comp.unix.programmer,comp.lang.perl,comp.std.internat
Subject: Re: Tools for manipulating message catalogs
Message-ID:
Date: 16 Apr 91 15:24:35 GMT
Sender: news@wang.com
Reply-To: rschwartz@office.wang.com
Organization: Mail to News Gateway
Lines: 126

eliot@chutney.rtp.dg.com (Topher Eliot) writes:
> (much omitted)
>
> The moral of this is that ONE SHOULDN'T DO THINGS THAT REQUIRE MAINTAINING
> SYNCHRONIZATION BETWEEN THE APPLICATION AND THE MESSAGE CATALOG, like
> inserting a new message into the middle of an existing message catalog.
>
> (more omitted)
>
> You should never WANT automatic numbering of your messages.
>
> (still more omitted)
>
> Have I made my point clear? Would anyone care to point out flaws in my logic?
> Does anyone still think that a tool to create a .h file out of a message
> catalog is useful?

YES!!! Your point is clear.

YES!!! I absolutely insist that generating .h files is required.

The flaws are not in your logic. The flaws are in your assumptions about
the tools that should be used to synchronize code and messages when they
reside in separate files. I.e., you presume that there are no such tools,
and I grant that it is normal for there to be none.

The dangers that you point out are completely valid, and your point that
these dangers are exacerbated by the logistics involved in sending
materials hither and yon for translation is well taken. But the solution
isn't to make a bad software engineering decision. Invent the right tools
instead!

The use of mnemonic names in message catalogs is an absolute necessity in
any application other than trivial toys. Most of the benefits are too
obvious to mention. One that bears special attention is the ability to
re-organize multiple catalogs without re-numbering.
If the run-time organization of code changes from one release to the
next, it may make perfectly good sense to divide or merge message
catalogs, or to re-locate individual messages. Mnemonic labels can
minimize the code impact of such changes. I might even suggest going to
enough lengths to remove the code impact completely by adding a level of
indirection so that code is unaware which message catalog a given message
comes from.

Another point that strongly supports the use of such a tool is that it
helps translators to identify their mistakes. Comparison of the .h file
generated from the translated catalog against the version from the
release is a sure way to detect inadvertently deleted messages and a host
of other errors. I haven't met a translator who wouldn't love to have a
way to check for such editing errors.

Something to help us developers, too: tracking down obsolete messages is
a snap if you use a cross-referencer to find unused #defines in your
generated .h files. Maybe the message really is obsolete and should be
gotten rid of, since translating obsolete messages to a dozen or so
languages can cost big bucks, pounds, marks, yen, etc. Maybe you added an
error message to the catalog because you knew you'd need it, but you
forgot to code that else clause!

Am I reaching? Am I stretching my logic to make a point? Yup! But does
anyone still think that a tool to create a .h file out of a message
catalog is useless? :-)

erik@srava.sra.co.jp (Erik M. van der Poel) writes:
> Using numbers for the message ids was a bad idea in the first place.
> (Thank goodness XPG3 and AT&T's specs are not International
> Standards.)

Once compiled into an executable, no one need care what the
representation of a message id is. Nobody says that the #define in the
generated header ultimately has to resolve to an integer. It merely has
to resolve to whatever the functional interface requires, and if that
changes you just change the .h generation tool. Information hiding
strikes again!

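To make the idea of a .h generation tool concrete, here is a minimal sketch in Python. Everything specific in it is an assumption made for illustration: the one-line-per-message source format ("NAME text", with "$" comment lines), the function names, and the sample messages are all invented, and this is not the syntax of any particular catalog compiler. It assigns the numbers itself, emits one #define per mnemonic, and reports mnemonics that a translated catalog has lost, which is a simplified form of comparing the two generated headers.

```python
import re

# Hypothetical source format (an assumption, not any real catalog syntax):
#   MNEMONIC_NAME  message text
# Blank lines and lines starting with '$' are treated as comments.
LINE_RE = re.compile(r"^([A-Z][A-Z0-9_]*)\s+(.*)$")


def parse_catalog(text):
    """Return an ordered list of (mnemonic, message) pairs."""
    entries = []
    seen = set()
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.rstrip()
        if not line or line.startswith("$"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"line {lineno}: expected 'NAME text'")
        name, message = match.groups()
        if name in seen:
            raise ValueError(f"line {lineno}: duplicate mnemonic {name}")
        seen.add(name)
        entries.append((name, message))
    return entries


def generate_header(entries, first_id=1):
    """Emit one #define per mnemonic; the ids are assigned by the tool."""
    lines = ["/* Generated file: do not edit by hand. */"]
    for msg_id, (name, _message) in enumerate(entries, start=first_id):
        lines.append(f"#define {name} {msg_id}")
    return "\n".join(lines) + "\n"


def missing_mnemonics(release_entries, translated_entries):
    """Names present in the release catalog but absent from the translation."""
    translated = {name for name, _ in translated_entries}
    return [name for name, _ in release_entries if name not in translated]


# Example data, invented for illustration.
RELEASE = """\
$ sample catalog (format is hypothetical)
MSG_OPEN_FAILED   Cannot open %s
MSG_DISK_FULL     Disk is full
MSG_BAD_OPTION    Unknown option %s
"""

TRANSLATED = """\
MSG_OPEN_FAILED   Impossible d'ouvrir %s
MSG_BAD_OPTION    Option inconnue %s
"""
```

Calling generate_header(parse_catalog(RELEASE)) yields the header text, and missing_mnemonics(parse_catalog(RELEASE), parse_catalog(TRANSLATED)) flags the message the translator dropped. If the functional interface ever stopped taking integers, only generate_header would need to change.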
> Wouldn't it be possible to create a reasonably efficient
> implementation using hashing and caching with symbolic names instead
> of numeric ids? Then we can add/delete/modify messages at will. We
> should leave numbering and counting to the computer.

Yes, it is possible, but why bother? The organization of the run-time
store of messages can be changed for efficiency without any impact on the
functional interface. As an example, I have implemented a (non-unix
based) system that compiles the (equivalent of the) message catalog into
assembler code for a function that retrieves the messages from (again the
equivalent of) the text segment of a shared runtime archive. The
performance is frighteningly good, and I don't do any fancy indexing or
hashing. I could add it, but for a large-scale multi-user application the
big bang for the buck was in reducing paging by using non-modifiable
shared memory instead of data space. Yes, it just uses integer ids, and
yes, it generates the headers.

nazgul@alphalpha.com (Kee Hinckley) writes:
> In addition there is a way to, if not prevent
> the problem, at least spot it. Simply have a convention, as a user
> of message catalogs, that messageId #1 is a version number. Every
> time you make an incompatible change to the catalog, change the version
> number. Have your application check the version number and complain
> if it doesn't match.

More than that, have it check for the last and one-past-the-last message
to verify that the catalog has exactly the right number of entries. Don't
tolerate any errors in the message configuration -- they're just as
critical as errors in configuration of executables. Just don't take a
checksum! :-)

If you want real safety, make the versioning mechanism automatic. Have
your make file bump it after any change that affected the .h file, and
drop the new version number into both the message cat and the .h.

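The start-up checks described above (version in message 1, plus the last and one-past-the-last probe) can be sketched roughly as follows. This is only an illustration under stated assumptions: the constant names, the version value, and the idea of the loaded catalog as a plain dictionary from id to text are all hypothetical stand-ins for whatever a generated header and a real catalog loader would provide.

```python
# Hypothetical constants, standing in for values from a generated header.
MSG_CATALOG_VERSION_ID = 1   # by convention, message 1 holds the version
EXPECTED_VERSION = "7"       # version the code was built against (invented)
MSG_LAST_ID = 4              # highest message id the code was built against


def check_catalog(messages, expected_version=EXPECTED_VERSION,
                  last_id=MSG_LAST_ID):
    """Return a list of problems; an empty list means the catalog matches.

    `messages` maps integer message ids to text, as a loaded catalog might.
    """
    problems = []

    # Version check: message 1 must carry the version the code expects.
    version = messages.get(MSG_CATALOG_VERSION_ID)
    if version != expected_version:
        problems.append(
            f"version mismatch: expected {expected_version!r}, "
            f"found {version!r}"
        )

    # Entry-count check: the last message must exist ...
    if last_id not in messages:
        problems.append(f"message {last_id} is missing: catalog is too short")

    # ... and the one past the last must not.
    if last_id + 1 in messages:
        problems.append(f"message {last_id + 1} exists: catalog is too long")

    return problems
```

An application would run this once at start-up and refuse to continue if the returned list is not empty, treating a mismatched catalog like a mismatched executable.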
Have your code do its version check comparing the run-time version
against a symbolic constant from the very same include file! A re-compile
of the code that includes the .h is forced anyhow, so the code is always
in step with the message catalog version.

Now, provide a modified version of the make file for your translators
that does the same checking but instead of triggering a bump in version
and re-compile (you don't give them source anyhow) it simply triggers an
error.

A final comment: The main reason that I am concerned about this is that
internationalization of code must not violate developers' sense of what
is right. The only people I have run into who are more fanatic than
non-English speakers who (rightly) flame against non-translatable code,
are developers who (rightly) flame against un-readable code. There is
finally real recognition of the need for designing internationalization
in applications from Day One, and this has been a hard-fought victory.
Let's not make the software so ugly that everyone will go back to the old
attitude of "we'll worry about international in release 2".

rich schwartz (All views expressed are my own, and not Wang Labs, Inc.'s.).
rschwartz@office.wang.com VOICE (508) 967 5027 FAX (508) 967 0947m.
Wang Labs, Inc., M/S 019-58A, 1 Industrial Ave., Lowell, MA 01851