Path: utzoo!attcan!uunet!husc6!bloom-beacon!gatech!ncar!noao!arizona!naucse!jdc
From: jdc@naucse.UUCP (John Campbell)
Newsgroups: comp.os.vms
Subject: Re: Flex and DEC multi-nationals, help!
Summary: ?? Sorry, here's what I did...
Keywords: "char" is signed.
Message-ID: <720@naucse.UUCP>
Date: 24 May 88 21:59:02 GMT
References: <8546@dartvax.Dartmouth.EDU>
Organization: Northern Arizona University, Flagstaff, AZ
Lines: 52

I'm posting this because I could not find a uucp path to eleazar, if
anyone is interested in flex esoterica, read on.

In article <8546@dartvax.Dartmouth.EDU>, earleh@eleazar.dartmouth.edu (Earle R. Horton) writes:
> I am trying to port Flex to Macintosh Programmer's Workshop C, which
> like VAX C treats characters as signed.  Has anyone in this part of
> the world had any luck making Flex scan DEC multi-national characters
> properly?
>
> The best I have been able to do so far is to get Flex and its scanners
> to pass [200-377] unchanged, but ...

Well, I wanted to do the same thing using the original lex (I think the
following will hold for flex as well).  The best I could do was fold the
upper bit stuff back to 7 bit ascii and then build patterns that worked
on the 7 bit representation (I wanted of course).  The macro stuff looked
something like the following (lex fragment).

: %{
: #define NewEOF 127
:
: /* Change lex's input to allow us to think csi (9b) is esc (1b). */
: # define input() (((yytchar=yysptr>yysbuf?U(*--yysptr):getc(yyin)&0x7f)\
:                  ==10?(yylineno++,yytchar):yytchar)==NewEOF?0:yytchar)
:
: /* Done with lex substitution. */
: %}
: csi    "\033"
: eseq1  {csi}[ -/]*[0-~]
: eseq2  {csi}\[[0-?]*[ -/]*[@-~]
: eseq3  {csi}[0-?]*[ -/]*[@-~]
: %%
: {eseq1}  {/* Ignore */ }
: {eseq2}  {/* Ignore */ }
: {eseq3}  {/* Ignore */ }

Note that flex doesn't have the same tables feature as lex, but I couldn't
extend the lex tables anyway.
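
The 7-bit folding used in the fragment above can be sketched in plain C.
This is only an illustration, not code from lex or flex: the names
fold7, table_index, FOLDED_EOF and fold7_selfcheck are made up here, and
it assumes EOF is -1 on a two's-complement machine, which is what makes
the masked end-of-file value come out as 127 (the NewEOF in the fragment).

```c
#include <assert.h>

/* Hypothetical names for illustration; none of these come from lex or flex. */

/* What EOF turns into after masking, assuming EOF == -1 on a
   two's-complement machine.  This is why the fragment tests against
   NewEOF (127) rather than EOF. */
#define FOLDED_EOF 127

/* Fold an input value to 7 bits, as getc(yyin)&0x7f does in the macro. */
static int fold7(int c)
{
    return c & 0x7f;
}

/* Turn a possibly signed char into a safe index for a 256-entry table. */
static int table_index(char c)
{
    return (unsigned char)c;
}

/* Self-check of the claims made in the comments above. */
static void fold7_selfcheck(void)
{
    assert(fold7(0x9b) == 0x1b);        /* 8-bit CSI folds to ESC */
    assert(fold7('A') == 'A');          /* 7-bit ASCII is unchanged */
    assert(fold7(-1) == FOLDED_EOF);    /* masked -1 becomes 127 */
    assert(fold7(0x7f) == FOLDED_EOF);  /* so DEL looks like end of input */
    assert(fold7(0xff) == FOLDED_EOF);  /* and so does byte 0xff */
    assert(table_index((char)0x9b) == 0x9b);
}
```

Calling fold7_selfcheck() from main() exercises each case.  The cost of
the masking approach shows up in the last two fold7 checks: once folded,
a DEL or a 0xff byte in the input cannot be told apart from end of file.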

Building a special version of flex that can handle 255 character tables
might not be too hard--if I am right, you are getting hit because flex
assumes 127 characters in its character table.  You might try playing
with CSIZE in flexdef.h (defined as 127).  I'm not sure if this will
impact other values (like INITIAL_MAX_CCL_TBL_SIZE, etc.).  A note to
Vern Paxson (ucbvax!lbl-csam.arpa!vern) regarding the impact of making
this change and a plea for supporting character sets greater than 127
may even be reasonable.

As stated above, *sorry* I don't have the final answer, but I do
sympathize.
-- 
John Campbell               ...!arizona!naucse!jdc

unix?  Sure send me a dozen, all different colors.