Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!ico!vail!rcd
From: rcd@ico.ISC.COM (Dick Dunn)
Newsgroups: comp.lang.c
Subject: Re: need EBCDIC to ASCII function
Summary: well, it was and wasn't EBCDIC
Message-ID: <16204@vail.ICO.ISC.COM>
Date: 13 Oct 89 05:10:44 GMT
References: <1060@einstein.misemi> <1989Oct4.203729.11700@utzoo.uucp> <10946@riks.csl.sony.co.jp>
Organization: Interactive Systems Corp, Boulder, CO
Lines: 37

diamond@csl.sony.co.jp (Norman Diamond) writes:
> ...henry@utzoo.uucp (Henry Spencer) writes:
> >You will have to be more specific.  Which flavor of EBCDIC?  EBCDIC is
> >not a single well-defined character code, but a family of somewhat-similar
> >codes...

> In fact EBCDIC is just as well-defined as ASCII.  Only some IBM print
> trains did not use EBCDIC...

I think it's not quite this simple.  Haul out your trusty yellow card
(that's the successor to the green card, right?:-) and look at the "Code
Translation Table."  You will see a pair of columns labeled "EBCDIC(1)".
It is this pair of columns (at least) which give rise to Henry's comment
about "somewhat-similar codes" and Norman's comment about print trains.

However, if you read the footnote (1) referenced by the column heading,
you see:
	"Two columns of EBCDIC graphics are shown.  The first gives
	standard bit pattern assignments.  The second shows the T-11
	and TN text printing."
In other words, there are two forms of EBCDIC here (Henry's point), but
one of them is standard (Norman's point).  Ouch!  Dumb!  Keep in mind that
this wasn't the result of some dispute among vendors; IBM didn't get it
right among themselves.

What does this mean to an implementor?  There are some interesting
implications here!  On input, you need to accept whatever you get.  On
output, you want to produce the codes that print right.  If you're parsing
input text (I fell into this while working on a Pascal compiler), you can
simply accept both codes for characters which differ.
(There aren't any conflicts which matter; there are lots of holes in the
codesets.)  But for character and string constants, you gotta generate the
codes you're given, which means that the code for left bracket may not be
equal to the code for left bracket (if they were from different flavors of
EBCDIC)!  Worse, the character that prints correctly isn't the "standard"
one--so you can either have the program listing (remember those?:-) look
right, or get the right answer!
-- 
Dick Dunn     rcd@ico.isc.com    uucp: {ncar,nbires}!ico!rcd     (303)449-2870
   ...No DOS.  UNIX.
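
For anyone who wants the parsing rule above spelled out, here is a minimal
sketch in C, not the Pascal compiler's actual code.  It assumes a lexer that
classifies one byte at a time.  The names classify() and constant_value()
are hypothetical, and the two code-point values are placeholders chosen for
illustration only; take the real ones from the two EBCDIC(1) columns of your
own card.

```c
/* ASSUMPTION: illustrative code points only, not taken from the post.
   Substitute the left-bracket values from the two EBCDIC(1) columns. */
#define LBRACKET_STANDARD 0xBA  /* assumed "standard" assignment */
#define LBRACKET_PRINTING 0xAD  /* assumed TN/T-11 print-train assignment */

enum token { TOK_OTHER, TOK_LBRACKET };

/* Parsing side: accept both codes, so either flavor of left bracket
   is recognized as the same token. */
enum token classify(unsigned char c)
{
    switch (c) {
    case LBRACKET_STANDARD:
    case LBRACKET_PRINTING:
        return TOK_LBRACKET;
    default:
        return TOK_OTHER;
    }
}

/* Constant side: a character constant keeps exactly the byte it was
   given, so the two flavors of left bracket stay distinct values. */
unsigned char constant_value(unsigned char c)
{
    return c;
}
```

With these definitions the two left brackets parse identically but compare
unequal as constants, which is exactly the trap described above.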