Xref: utzoo comp.emacs:4521 comp.lang.c:13686 comp.sys.ibm.pc:20719
Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!mailrus!iuvax!pur-ee!j.cc.purdue.edu!nwd
From: nwd@j.cc.purdue.edu (Daniel Lawrence)
Newsgroups: comp.emacs,comp.lang.c,comp.sys.ibm.pc
Subject: Re: Programming and international character sets.
Summary: 8 bit chars in uEMACS
Keywords: 8 bit characters
Message-ID: <8045@j.cc.purdue.edu>
Date: 31 Oct 88 14:54:43 GMT
References: <532@krafla.rhi.hi.is>
Reply-To: nwd@j.cc.purdue.edu.UUCP (Daniel Lawrence)
Organization: Purdue University
Lines: 48

In article <532@krafla.rhi.hi.is> kjartan@rhi.hi.is (Kjartan R. Gudmundsson) writes:
>
>How difficult is it to convert american/english programs so that they can
>be used to handle foreign text? The answer of course depends on the language

[a description of some of the problems using 8 bit chars]

>
>Let's look at some code from MicroEMACS:
>

[a code excerpt from MicroEMACS 3.9]

>Ugly isn't it?
>

Ok, I am feeling a little picked on here... a lot of people like to use
uEMACS as the example when pointing out things like this. When I first
started working with it, it was just for me. But that is really no
excuse...

>Another way of doing this is using "is.." functions that are

[an alternative which is better]

>This code is better (most of the is.. things are macros that mask

[More descriptions of 8 bit problems...]

And someone finally proposes some solutions rather than just blindly
stabbing out and complaining. During the last round of complaints I sent
out a request for information on this problem, and the best I got back
was: go to the library and do some research. Well, for a project I am
doing in my spare time, and considering the poor library system around
here, I really wasn't happy to hear all the griping and then get no help
from anyone to fix the problems. So I applaud Mr. Gudmundsson for his
mail.

># Kjartan R. Gudmundsson #
># Raudalaek 12 #
># 105 Reykjavik # Internet: kjartan@rhi.hi.is #

However, after the last round, I thought carefully about the 8 bit
problems, and resolved that the issue was too complex on a
language-by-language basis for me to ever attempt to get all the case
mappings correct. So when you see the next version of MicroEMACS, it
will have a user-changeable upper/lowercase mapping function (which is
working right now).

Note: This slows down the regular pattern matching code considerably, so
uEMACS can be compiled with the diacritical (un-American, in this case)
support turned off, but both options now exist.

Daniel Lawrence (317) 742-5153
UUCP: {pur-ee!}j.cc.purdue.edu!nwd
ARPA: nwd@j.cc.purdue.edu
FIDO: 1:201/10 The Programmer's Room (317) 742-5533
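
For illustration only, a user-changeable case mapping of the kind described above might look roughly like the following in C. This is a sketch and not the MicroEMACS source: the compile-time switch CASE_TABLES and the names case_init, case_set_pair, to_upper_ch, to_lower_ch and eq_nocase are all made up for the example. It also assumes an ASCII-compatible 8 bit character set in which 'a'..'z' are contiguous.

```c
/* Hypothetical compile-time switch (not a real MicroEMACS flag):
 * 1 = table-driven, user-changeable mapping; 0 = ASCII-only fast path. */
#ifndef CASE_TABLES
#define CASE_TABLES 1
#endif

#if CASE_TABLES

static unsigned char up_map[256];   /* up_map[c]  = uppercase form of c */
static unsigned char low_map[256];  /* low_map[c] = lowercase form of c */

/* Start from the identity mapping, then install the 26 ASCII pairs. */
static void case_init(void)
{
    int c;

    for (c = 0; c < 256; c++)
        up_map[c] = low_map[c] = (unsigned char)c;
    for (c = 'a'; c <= 'z'; c++) {
        up_map[c] = (unsigned char)(c - 'a' + 'A');
        low_map[c - 'a' + 'A'] = (unsigned char)c;
    }
}

/* Let the user declare that `lower` and `upper` form a case pair. */
static void case_set_pair(unsigned char lower, unsigned char upper)
{
    up_map[lower] = upper;
    low_map[upper] = lower;
}

static int to_upper_ch(int c) { return up_map[(unsigned char)c]; }
static int to_lower_ch(int c) { return low_map[(unsigned char)c]; }

#else  /* ASCII only: no tables, and user-defined pairs are ignored */

static void case_init(void) { }

static void case_set_pair(unsigned char lower, unsigned char upper)
{
    (void)lower;
    (void)upper;
}

static int to_upper_ch(int c)
{
    return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

static int to_lower_ch(int c)
{
    return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

#endif

/* Case-insensitive character comparison, as a pattern matcher might use. */
static int eq_nocase(int a, int b)
{
    return to_lower_ch(a) == to_lower_ch(b);
}
```

With the tables compiled in, a startup file could call case_init() once and then register extra pairs, for example case_set_pair(0xFE, 0xDE) for thorn in ISO 8859-1, without the editor having to know any one language's rules. The extra table lookup per character in the matcher is the kind of cost the note above refers to.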