Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac!att!ucbvax!SH.CS.NET!gwilliam
From: gwilliam@SH.CS.NET (George Williams)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: SOS: C Routines for ASCII to EBCDIC Conversion and Vice-versa
Message-ID: <9106240747.AA22605@ucbvax.Berkeley.EDU>
Date: 24 Jun 91 05:43:54 GMT
References: <1991Jun21.091119.8500@cs.UAlberta.CA>
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 336

[ Disclaimer: Views and opinions, expressed or implied, are my own ]

I agree with what you state below. There is no question that it is best
to go with a 'network-neutral' code set between disparate systems (such
as those in question). There are a couple of points of note, however:

() The mechanics of the operation are straightforward. Declare a static
   array of 256 bytes (the size of the EBCDIC code set). Then use the
   current character as an offset into this array to obtain the
   translated value. This is a simple, quick, and widely used technique.
   The same table can be used in both directions as long as array
   initialization is consistent across your compute environment, i.e.
   all nodes use the same (agreed-on) 'network data' character set.

() There is a problem, however (and not just with 8859/1), when a
   default character such as "?" is substituted for non-display or
   non-print characters. If the original data is later retrieved from
   the target system, it is no longer intact, at least for presentation
   purposes.

   One way I have gotten around this problem in the past was to encode
   these characters as transparent entries in any table used. In other
   words, the entries for these characters match the array index.

   There are no doubt more clever and elegant methods, but this proved
   very flexible.

BTW: What happened to SCS data (codes '00'-'3F', the SNA character
string for the old LU type 1) with this? Is that what Code Page 37
addresses?
George Williams

Date: 21 Jun 91 09:11:19 GMT
From: Mark Israel
Subject: Re: SOS: C Routines for ASCII to EBCDIC Conversion and Vice-versa

In article <1991Jun20.160334.11278@gdr.bath.ac.uk>, P.Smee@bristol.ac.uk (Paul Smee) writes:

> There's a basic problem here, which is that there is (still) no such
> thing as EBCDIC.... For a real real definitive answer, you'll have to get
> friendly with someone in IBM.

I extracted the following from local documentation.

Mark Israel                      I have heard the Wobble!
userisra@mts.ucs.ualberta.ca

------------------------------------------------------------------------------

In February 1987, a new eight-bit ISO character set standard, 8859/1,
was ratified. Also in 1987, IBM published an EBCDIC standard called
Code Page 37, based on the ISO 8859/1 standard. Both of these standards
contain identical character graphics. The ISO 8859/1 character set
contains such EBCDIC characters as the logical-not sign and the cent
sign, and the new EBCDIC character set contains the ISO tilde and
circumflex, among other ASCII characters. The ISO 8859/1 standard is
supported by many large computer manufacturers, including DEC and IBM.

As we deal more and more with other machines using ISO-based rather
than EBCDIC-based character coding schemes, it becomes imperative that
we be able to move data from one machine to another and back again
without loss of information. The mapping that results from using the
ISO 8859/1 standard and the IBM Code Page 37 EBCDIC will allow us to
move information back and forth between ISO- and EBCDIC-based machines
with none of the problems we have had in the past.

EBCDIC  ASCII   GRAPHIC  DESCRIPTION
---------------------------------------------------------------------
X'00'   X'00'   NUL      null
X'01'   X'01'   SOH      start of heading (Ctrl-A)
X'02'   X'02'   STX      start of text (Ctrl-B)
X'03'   X'03'   ETX      end of text (Ctrl-C)
X'04'   X'9C'   ?        ...
X'05'   X'09'   HT       horizontal tabulation (Ctrl-I)
X'06'   X'86'   ?        ...
X'07'   X'7F'   DEL      delete (rubout, DEL control char)
X'08'   X'97'   ?        ...
X'09'   X'8D'   ?        ...
X'0A'   X'8E'   ?        ...
X'0B'   X'0B'   VT       vertical tabulation (Ctrl-K)
X'0C'   X'0C'   FF       form feed (Ctrl-L)
X'0D'   X'0D'   CR       carriage return (Ctrl-M)
X'0E'   X'0E'   SO       shift-out (Ctrl-N)
X'0F'   X'0F'   SI       shift-in (Ctrl-O)
X'10'   X'10'   DLE      data link escape (Ctrl-P)
X'11'   X'11'   DC1      device control 1 (X-On, Ctrl-Q)
X'12'   X'12'   DC2      device control 2 (Ctrl-R)
X'13'   X'13'   DC3      device control 3 (X-Off, Ctrl-S)
X'14'   X'9D'   ?        ...
X'15'   X'85'   ?        ...
X'16'   X'08'   BS       backspace (Ctrl-H)
X'17'   X'87'   ?        ...
X'18'   X'18'   CAN      cancel (Ctrl-X)
X'19'   X'19'   EM       end of medium (Ctrl-Y)
X'1A'   X'92'   ?        ...
X'1B'   X'8F'   ?        ...
X'1C'   X'1C'   FS       file separator
X'1D'   X'1D'   GS       group separator
X'1E'   X'1E'   RS       record separator
X'1F'   X'1F'   US       unit separator
X'20'   X'80'   ?        ...
X'21'   X'81'   ?        ...
X'22'   X'82'   ?        ...
X'23'   X'83'   ?        ...
X'24'   X'84'   ?        ...
X'25'   X'0A'   LF       line feed (Ctrl-J)
X'26'   X'17'   ETB      end of transmission block (Ctrl-W)
X'27'   X'1B'   ESC      escape (Escape)
X'28'   X'88'   ?        ...
X'29'   X'89'   ?        ...
X'2A'   X'8A'   ?        ...
X'2B'   X'8B'   ?        ...
X'2C'   X'8C'   ?        ...
X'2D'   X'05'   ENQ      enquiry (Ctrl-E)
X'2E'   X'06'   ACK      acknowledge (Ctrl-F)
X'2F'   X'07'   BEL      bell (Ctrl-G)
X'30'   X'90'   ?        ...
X'31'   X'91'   ?        ...
X'32'   X'16'   SYN      synchronous idle (Ctrl-V)
X'33'   X'93'   ?        ...
X'34'   X'94'   ?        ...
X'35'   X'95'   ?        ...
X'36'   X'96'   ?        ...
X'37'   X'04'   EOT      end of transmission (Ctrl-D)
X'38'   X'98'   ?        ...
X'39'   X'99'   ?        ...
X'3A'   X'9A'   ?        ...
X'3B'   X'9B'   ?        ...
X'3C'   X'14'   DC4      device control 4 (Ctrl-T)
X'3D'   X'15'   NAK      negative acknowledge (Ctrl-U)
X'3E'   X'9E'   ?        ...
X'3F'   X'1A'   SUB      substitute character (Ctrl-Z)
X'40'   X'20'            space (blank)
X'41'   X'A0'   ?        no-break space
X'42'   X'E2'   ?        small a with circumflex accent
X'43'   X'E4'   ?        small a with diaeresis
X'44'   X'E0'   ?        small a with grave accent
X'45'   X'E1'   ?        small a with acute accent
X'46'   X'E3'   ?        small a with tilde
X'47'   X'E5'   ?        small a with ring above
X'48'   X'E7'   ?        small c with cedilla
X'49'   X'F1'   ?        small n with tilde
X'4A'   X'A2'   ?        cent sign
X'4B'   X'2E'   .        period, full stop
X'4C'   X'3C'   <        less-than sign
X'4D'   X'28'   (        left parenthesis
X'4E'   X'2B'   +        plus sign
X'4F'   X'7C'   |        vertical line (bar, "or" sign)
X'50'   X'26'   &        ampersand (and sign)
X'51'   X'E9'   ?        small e with acute accent
X'52'   X'EA'   ?        small e with circumflex accent
X'53'   X'EB'   ?        small e with diaeresis
X'54'   X'E8'   ?        small e with grave accent
X'55'   X'ED'   ?        small i with acute accent
X'56'   X'EE'   ?        small i with circumflex accent
X'57'   X'EF'   ?        small i with diaeresis
X'58'   X'EC'   ?        small i with grave accent
X'59'   X'DF'   ?        small sharp s, German
X'5A'   X'21'   !        exclamation mark
X'5B'   X'24'   $        dollar sign
X'5C'   X'2A'   *        asterisk (star)
X'5D'   X'29'   )        right parenthesis
X'5E'   X'3B'   ;        semicolon
X'5F'   X'AC'   ?        not sign
X'60'   X'2D'   -        minus sign or hyphen
X'61'   X'2F'   /        solidus (slash)
X'62'   X'C2'   ?        capital A with circumflex accent
X'63'   X'C4'   ?        capital A with diaeresis
X'64'   X'C0'   ?        capital A with grave accent
X'65'   X'C1'   ?        capital A with acute accent
X'66'   X'C3'   ?        capital A with tilde
X'67'   X'C5'   ?        capital A with ring
X'68'   X'C7'   ?        capital C with cedilla
X'69'   X'D1'   ?        capital N with tilde
X'6A'   X'A6'   ?        broken bar
X'6B'   X'2C'   ,        comma
X'6C'   X'25'   %        percent sign
X'6D'   X'5F'   _        low line (underscore)
X'6E'   X'3E'   >        greater-than sign
X'6F'   X'3F'   ?        question mark
X'70'   X'F8'   ?        small o with slash
X'71'   X'C9'   ?        capital E with acute accent
X'72'   X'CA'   ?        capital E with circumflex accent
X'73'   X'CB'   ?        capital E with diaeresis
X'74'   X'C8'   ?        capital E with grave accent
X'75'   X'CD'   ?        capital I with acute accent
X'76'   X'CE'   ?        capital I with circumflex accent
X'77'   X'CF'   ?        capital I with diaeresis
X'78'   X'CC'   ?        capital I with grave accent
X'79'   X'60'   `        grave accent
X'7A'   X'3A'   :        colon
X'7B'   X'23'   #        number sign (hash mark, sharp sign)
X'7C'   X'40'   @        commercial at
X'7D'   X'27'   '        apostrophe (single quote)
X'7E'   X'3D'   =        equals sign
X'7F'   X'22'   "        quotation mark (double quote)
X'80'   X'D8'   ?        capital O with slash
X'81'   X'61'   a        small a
X'82'   X'62'   b        small b
X'83'   X'63'   c        small c
X'84'   X'64'   d        small d
X'85'   X'65'   e        small e
X'86'   X'66'   f        small f
X'87'   X'67'   g        small g
X'88'   X'68'   h        small h
X'89'   X'69'   i        small i
X'8A'   X'AB'   ?        angle quotation mark left (<< mark)
X'8B'   X'BB'   ?        angle quotation mark right (>> mark)
X'8C'   X'F0'   ?        small eth, Icelandic
X'8D'   X'FD'   ?        small y with acute accent
X'8E'   X'FE'   ?        small thorn, Icelandic
X'8F'   X'B1'   ?        plus or minus sign
X'90'   X'B0'   ?        degree sign
X'91'   X'6A'   j        small j
X'92'   X'6B'   k        small k
X'93'   X'6C'   l        small l
X'94'   X'6D'   m        small m
X'95'   X'6E'   n        small n
X'96'   X'6F'   o        small o
X'97'   X'70'   p        small p
X'98'   X'71'   q        small q
X'99'   X'72'   r        small r
X'9A'   X'AA'   ?        ordinal indicator, feminine
X'9B'   X'BA'   ?        ordinal indicator, masculine
X'9C'   X'E6'   ?        small ae diphthong
X'9D'   X'B8'   ?        cedilla
X'9E'   X'C6'   ?        capital AE diphthong
X'9F'   X'A4'   ?        currency sign (lozenge)
X'A0'   X'B5'   ?        micro sign (small mu)
X'A1'   X'7E'   ~        tilde (wavy line)
X'A2'   X'73'   s        small s
X'A3'   X'74'   t        small t
X'A4'   X'75'   u        small u
X'A5'   X'76'   v        small v
X'A6'   X'77'   w        small w
X'A7'   X'78'   x        small x
X'A8'   X'79'   y        small y
X'A9'   X'7A'   z        small z
X'AA'   X'A1'   ?        inverted exclamation mark
X'AB'   X'BF'   ?        inverted question mark
X'AC'   X'D0'   ?        capital D with stroke, Icelandic eth
X'AD'   X'DD'   ?        capital Y with acute accent
X'AE'   X'DE'   ?        capital thorn, Icelandic
X'AF'   X'AE'   ?        registered sign (circled capital R)
X'B0'   X'5E'   ^        circumflex accent
X'B1'   X'A3'   ?        pound sign (Sterling currency)
X'B2'   X'A5'   ?        yen sign (Nipponese currency)
X'B3'   X'B7'   ?        middle dot (scalar product)
X'B4'   X'A9'   ?        copyright sign (circled capital C)
X'B5'   X'A7'   ?        section sign (S-half-above-S sign)
X'B6'   X'B6'   ?        pilcrow (paragraph, double-barred P)
X'B7'   X'BC'   ?        fraction one-quarter (1/4)
X'B8'   X'BD'   ?        fraction one-half (1/2)
X'B9'   X'BE'   ?        fraction three-quarters (3/4)
X'BA'   X'5B'   [        left square bracket
X'BB'   X'5D'   ]        right square bracket
X'BC'   X'AF'   ?        macron
X'BD'   X'A8'   ?        diaeresis or umlaut
X'BE'   X'B4'   ?        acute accent
X'BF'   X'D7'   ?        multiply sign (vector product)
X'C0'   X'7B'   {        left curly bracket (left brace)
X'C1'   X'41'   A        capital A
X'C2'   X'42'   B        capital B
X'C3'   X'43'   C        capital C
X'C4'   X'44'   D        capital D
X'C5'   X'45'   E        capital E
X'C6'   X'46'   F        capital F
X'C7'   X'47'   G        capital G
X'C8'   X'48'   H        capital H
X'C9'   X'49'   I        capital I
X'CA'   X'AD'   ?        soft hyphen
X'CB'   X'F4'   ?        small o with circumflex accent
X'CC'   X'F6'   ?        small o with diaeresis
X'CD'   X'F2'   ?        small o with grave accent
X'CE'   X'F3'   ?        small o with acute accent
X'CF'   X'F5'   ?        small o with tilde
X'D0'   X'7D'   }        right curly bracket (right brace)
X'D1'   X'4A'   J        capital J
X'D2'   X'4B'   K        capital K
X'D3'   X'4C'   L        capital L
X'D4'   X'4D'   M        capital M
X'D5'   X'4E'   N        capital N
X'D6'   X'4F'   O        capital O
X'D7'   X'50'   P        capital P
X'D8'   X'51'   Q        capital Q
X'D9'   X'52'   R        capital R
X'DA'   X'B9'   ?        superscript one
X'DB'   X'FB'   ?        small u with circumflex accent
X'DC'   X'FC'   ?        small u with diaeresis
X'DD'   X'F9'   ?        small u with grave accent
X'DE'   X'FA'   ?        small u with acute accent
X'DF'   X'FF'   ?        small y with diaeresis
X'E0'   X'5C'   \        reverse solidus (backslash)
X'E1'   X'F7'   ?        divide sign (dot over line over dot)
X'E2'   X'53'   S        capital S
X'E3'   X'54'   T        capital T
X'E4'   X'55'   U        capital U
X'E5'   X'56'   V        capital V
X'E6'   X'57'   W        capital W
X'E7'   X'58'   X        capital X
X'E8'   X'59'   Y        capital Y
X'E9'   X'5A'   Z        capital Z
X'EA'   X'B2'   ?        superscript two (squared)
X'EB'   X'D4'   ?        capital O with circumflex accent
X'EC'   X'D6'   ?        capital O with diaeresis
X'ED'   X'D2'   ?        capital O with grave accent
X'EE'   X'D3'   ?        capital O with acute accent
X'EF'   X'D5'   ?        capital O with tilde
X'F0'   X'30'   0        digit zero
X'F1'   X'31'   1        digit one
X'F2'   X'32'   2        digit two
X'F3'   X'33'   3        digit three
X'F4'   X'34'   4        digit four
X'F5'   X'35'   5        digit five
X'F6'   X'36'   6        digit six
X'F7'   X'37'   7        digit seven
X'F8'   X'38'   8        digit eight
X'F9'   X'39'   9        digit nine
X'FA'   X'B3'   ?        superscript three (cubed)
X'FB'   X'DB'   ?        capital U with circumflex accent
X'FC'   X'DC'   ?        capital U with diaeresis
X'FD'   X'D9'   ?        capital U with grave accent
X'FE'   X'DA'   ?        capital U with acute accent
X'FF'   X'9F'   ?        ...
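
For anyone who wants to try the table-lookup technique discussed in this
thread, here is a minimal C sketch built from the mapping tabulated above.
It is an illustration under stated assumptions, not code from either post:
the names ebcdic_to_iso, iso_to_ebcdic, build_inverse, translate and
round_trip_ok are hypothetical, a second table is derived at run time for
the return direction, and the two thorn entries follow the ISO 8859/1
assignments (X'FE' small thorn, X'DE' capital thorn).

```c
#include <stddef.h>

/* EBCDIC (Code Page 37) -> ISO 8859/1, indexed by the EBCDIC byte value.
 * One row per high nibble; 256 entries in total. */
const unsigned char ebcdic_to_iso[256] = {
    0x00,0x01,0x02,0x03,0x9C,0x09,0x86,0x7F,0x97,0x8D,0x8E,0x0B,0x0C,0x0D,0x0E,0x0F,
    0x10,0x11,0x12,0x13,0x9D,0x85,0x08,0x87,0x18,0x19,0x92,0x8F,0x1C,0x1D,0x1E,0x1F,
    0x80,0x81,0x82,0x83,0x84,0x0A,0x17,0x1B,0x88,0x89,0x8A,0x8B,0x8C,0x05,0x06,0x07,
    0x90,0x91,0x16,0x93,0x94,0x95,0x96,0x04,0x98,0x99,0x9A,0x9B,0x14,0x15,0x9E,0x1A,
    0x20,0xA0,0xE2,0xE4,0xE0,0xE1,0xE3,0xE5,0xE7,0xF1,0xA2,0x2E,0x3C,0x28,0x2B,0x7C,
    0x26,0xE9,0xEA,0xEB,0xE8,0xED,0xEE,0xEF,0xEC,0xDF,0x21,0x24,0x2A,0x29,0x3B,0xAC,
    0x2D,0x2F,0xC2,0xC4,0xC0,0xC1,0xC3,0xC5,0xC7,0xD1,0xA6,0x2C,0x25,0x5F,0x3E,0x3F,
    0xF8,0xC9,0xCA,0xCB,0xC8,0xCD,0xCE,0xCF,0xCC,0x60,0x3A,0x23,0x40,0x27,0x3D,0x22,
    0xD8,0x61,0x62,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0xAB,0xBB,0xF0,0xFD,0xFE,0xB1,
    0xB0,0x6A,0x6B,0x6C,0x6D,0x6E,0x6F,0x70,0x71,0x72,0xAA,0xBA,0xE6,0xB8,0xC6,0xA4,
    0xB5,0x7E,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7A,0xA1,0xBF,0xD0,0xDD,0xDE,0xAE,
    0x5E,0xA3,0xA5,0xB7,0xA9,0xA7,0xB6,0xBC,0xBD,0xBE,0x5B,0x5D,0xAF,0xA8,0xB4,0xD7,
    0x7B,0x41,0x42,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0xAD,0xF4,0xF6,0xF2,0xF3,0xF5,
    0x7D,0x4A,0x4B,0x4C,0x4D,0x4E,0x4F,0x50,0x51,0x52,0xB9,0xFB,0xFC,0xF9,0xFA,0xFF,
    0x5C,0xF7,0x53,0x54,0x55,0x56,0x57,0x58,0x59,0x5A,0xB2,0xD4,0xD6,0xD2,0xD3,0xD5,
    0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0xB3,0xDB,0xDC,0xD9,0xDA,0x9F
};

/* Inverse table (ISO 8859/1 -> EBCDIC); filled in by build_inverse(). */
unsigned char iso_to_ebcdic[256];

/* Derive the inverse by walking the forward table once. */
void build_inverse(void)
{
    int i;
    for (i = 0; i < 256; i++)
        iso_to_ebcdic[ebcdic_to_iso[i]] = (unsigned char)i;
}

/* Translate len bytes in place, using each byte as an index into table. */
void translate(unsigned char *buf, size_t len, const unsigned char *table)
{
    size_t i;
    for (i = 0; i < len; i++)
        buf[i] = table[buf[i]];
}

/* Return 1 if every byte value survives EBCDIC -> ISO -> EBCDIC.
 * Call build_inverse() first. */
int round_trip_ok(void)
{
    int i;
    for (i = 0; i < 256; i++)
        if (iso_to_ebcdic[ebcdic_to_iso[i]] != i)
            return 0;
    return 1;
}
```

After build_inverse(), translate(buf, n, ebcdic_to_iso) converts a buffer
in place and translate(buf, n, iso_to_ebcdic) restores it. Because the
mapping is one-to-one, no "?" defaults are needed, so data sent to the
other system and retrieved later should come back byte for byte.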