Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!think.com!zaphod.mps.ohio-state.edu!cis.ohio-state.edu!tut.cis.ohio-state.edu!ucbvax!ENG.SUN.COM!Mitch.Bradley
From: Mitch.Bradley@ENG.SUN.COM
Newsgroups: comp.lang.forth
Subject: Re: ZEN 15A
Message-ID: <9105240411.AA06297@ucbvax.Berkeley.EDU>
Date: 23 May 91 19:54:45 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Reply-To: Mitch.Bradley%ENG.SUN.COM@SCFVM.GSFC.NASA.GOV
Organization: The Internet
Lines: 63

> Ok. So I decided to convert my full-screen block editor to ZEN. I didn't
> finish the conversion but I found some interesting problems.
>
> The first problem (not a big deal) is -->.

: --> REFILL DROP ;

REFILL is a generalization of --> and QUERY that works on any input source,
returning a flag indicating whether or not the input buffer could be
refilled.

> The second problem is AT-XY.
> It's defined as DROP SPACES. I think it's better leave AT-XY out of ZEN than
> defining it this way.

Tough call. I can see both sides of this argument.

> The third problem, the one that stopped me is with 8-bit characters.
>
> My editor is provided with an accent filter, so I can enter all portuguese
> characters easily. Well, to make my program more legible, I did things like:
>
> CHAR ^ CONSTANT _portuguese_name_of_the_accent_
>
> But this don't work with ZEN. The CHAR leaves nothing on the stack, the
> accent is ignored (not this, the >127 ones), and CONSTANT is ignored. The
> word after constant is then interpreted, returning an error. I have two
> questions about this:
> 1) What is happening?

Something in the input stream mechanism is filtering out non-standard
characters. My copy of zen15 is on another machine, so I can't say for
sure, but I would guess that the problem is either in EXPECT or in the
"skip delimiters" portion of WORD. You might try searching for "127" in
the source code.

> 2) Should this happen on an ANS Forth? May this happen on an ANS Forth?
> (BASIS 15, of course)

It certainly *may* happen.
The character set for the source code of a standard program is the set
of printable characters in the 7-bit ASCII set. Use of any other character
causes the program to have an environmental dependency (in this case, on
the system supporting the Portuguese character set). This seems reasonable
to me; your program source code would not look very legible on my system,
since I don't have Portuguese characters.

The question about whether or not it *should* happen is a matter of
market economics. The implementor can choose whether to try to support
extended character sets, or whether to accept only the standard set.
Since Zen is a fairly minimal system without many environment-dependent
extensions, and is intended to illustrate the basics of ANS Forth, I'm
not surprised that Martin chose to restrict the character set. I expect
that successful commercial systems will adopt a different approach,
supporting international character sets where available.

A standard ANS Forth system is not required to reject non-printable
characters in blocks, nor is it required to accept them. The characters
whose meanings are precisely defined in the context of block source code
are the space character and the ASCII characters with codes from 33 to 126.

Mitch.Bradley@Eng.Sun.COM
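
A minimal sketch of how a program could check a buffer against the
character set described above (space plus codes 33 to 126). This is only
an illustration, not code from Zen or from the standard. It assumes a
system that provides WITHIN and ?DO and that stores one character per
address unit. The names STANDARD-CHAR? and NONSTANDARD-CHARS are
hypothetical.

```forth
\ Hypothetical helper: true if c is a space or a printable
\ 7-bit ASCII character, i.e. 32 <= c <= 126.
: STANDARD-CHAR? ( c -- flag )
   32 127 WITHIN ;

\ Hypothetical helper: count the characters in the string
\ c-addr u that fall outside that set.
\ Assumes one character per address unit.
: NONSTANDARD-CHARS ( c-addr u -- n )
   0 ROT ROT                 \ put a running count under the string
   OVER + SWAP ?DO           \ loop from c-addr up to c-addr+u
      I C@ STANDARD-CHAR? 0= IF 1+ THEN
   LOOP ;
```

On a system with the block word set, something like
1 BLOCK 1024 NONSTANDARD-CHARS . would then print how many characters in
block 1 carry an environmental dependency of this kind.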