Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!utcsri!greg
From: greg@utcsri.UUCP
Newsgroups: comp.arch,comp.lang.c
Subject: Re: String Processing Instruction
Message-ID: <4653@utcsri.UUCP>
Date: Thu, 23-Apr-87 10:11:05 EST
Article-I.D.: utcsri.4653
Posted: Thu Apr 23 10:11:05 1987
Date-Received: Sat, 25-Apr-87 01:41:27 EST
References: <15292@amdcad.UUCP> <693@jenny.cl.cam.ac.uk> <4605@utcsri.UUCP> <299@pembina.UUCP>
Reply-To: greg@utcsri.UUCP (Gregory Smith)
Organization: CSRI, University of Toronto
Lines: 71
Xref: utgpu comp.arch:1005 comp.lang.c:1787
Summary:

In article <299@pembina.UUCP> bjorn@alberta.UUCP (Bjorn R. Bjornsson) writes:
>In article <4605@utcsri.UUCP>, greg@utcsri.UUCP writes:
>>In article <693@jenny.cl.cam.ac.uk> am@cl.cam.ac.uk (Alan Mycroft) writes:
>>>    #define has_nullbyte_(x) ((x - 0x01010101) & ~x & 0x80808080)
>>>Then if e is an expression without side effects (e.g. variable)
>>>    has_nullbyte_(e)
>>>is nonzero iff the value of e has a null byte.
>>
>>If one of the bytes contains 0x80, then has_nullbyte() will    [BUNK]
>>say 'yes'. This can be circumvented by a more thorough test    [BUNK]
>>after this one to see if there really is a null there.         [BUNK]
>
>You're mistaken the "has_nullbyte_(x)" defined above works
>as advertised for all 32 bit x. That is to say it returns
>a nonzero result if and only if x contains a null byte.

Right. I apologize for any inconvenience I may have caused anybody.
I also apologize for posting bunk in a sleepy haze and then vanishing
for a week.

>
>>Someone posted a similar but usually better test on comp.arch ( I think )
>>a little while ago:
>>
>>#define has_nullbyte(e) ((e+0x7efefeff)^e)&0x81010100) != 0x81010100
>>
>>This one is only 'fooled' by an 0x80 in the most significant byte.
>>which makes the following test much simpler ( a sign test ).
>
>You are right this one does not always tell the truth. Besides
>it's a lot more effort.

It seems to be about the same, in the loop.
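For reference, here is a rough C sketch of the two tests being compared. The function names are made up for illustration, the add version is written with the parentheses balanced the way I read the macro, and a 32-bit unsigned word is assumed throughout.

```c
#include <stdint.h>

/* Subtract version (the has_nullbyte_ macro quoted above):
   nonzero iff some byte of x is zero. */
static int has_null_sub(uint32_t x)
{
    return ((x - 0x01010101u) & ~x & 0x80808080u) != 0;
}

/* Add version, parenthesized as I read it (an assumption).
   It reports a null byte whenever one exists, but it is also
   'fooled' by 0x80 in the most significant byte. */
static int has_null_add(uint32_t x)
{
    return (((x + 0x7efefeffu) ^ x) & 0x81010100u) != 0x81010100u;
}

/* Plain byte-by-byte reference, only for checking the two above. */
static int has_null_ref(uint32_t x)
{
    return (x & 0x000000ffu) == 0 || (x & 0x0000ff00u) == 0 ||
           (x & 0x00ff0000u) == 0 || (x & 0xff000000u) == 0;
}
```

With these, 0x80808080 is correctly rejected by the subtract version, while 0x80414243 has no null byte yet is flagged by the add version, which is why that version needs a follow-up check.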
Below is the subtract version, and the add version on an NS32k:

        movd    0(r0),r1
        subd    r2,r1
        bicd    0(r0),r1
        andd    r3,r1           ;r2 & r3 contain constants
        cmpqd   $0,r1           ;need this on ns32k

        movd    0(r0),r1
        addd    r2,r1
        xord    0(r0),r1
        andd    r3,r1
        cmpd    r3,r1

I can't remember why I thought it was more efficient. I guess it is if
you haven't got 'bic'. ( and if 'and' doesn't set the Z flag :-)
The original poster of this method was describing a RISC which had a
'compare registers and branch if equal' op.

>
>However for tricks like these to work reliably (especially on
>systems with memory protection) you had best get help from
>your local linker and 'malloc'. Otherwise one day a program
>is going to read off the edge of an accessible hunk of memory
>and flush your program straight down the drain. Making sure
>that no data item ends closer than n-1 bytes (if you're reading
>n bytes at time) from a boundary (between accessible and
>inaccessible memory) fixes this.

Not generally a problem. The whole point of this is that the 32-bit
accesses are constrained to 32-bit boundaries to make them fast.
Furthermore, in most systems, your access-controlled segments or
whatever are rounded up to a size at least as round as the machine
word. (The MMU does not usually deal with address lines a0,a1 on a
32-bit machine). Thus if it is legal to read the first byte in a word,
it is legal to read the whole word, and thus the last byte.

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...
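The alignment argument above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: word_strlen is a hypothetical name, a 32-bit word is assumed, and the "legal to read the whole word" part is a property of the hardware and MMU, not something the C language itself promises for bytes past the end of an object.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Subtract-style test from the thread: nonzero iff w has a zero byte. */
#define HAS_NULLBYTE(w) (((w) - 0x01010101u) & ~(w) & 0x80808080u)

/* Hypothetical helper (not from the post): strlen using aligned
   32-bit loads only. */
static size_t word_strlen(const char *s)
{
    const char *p = s;

    /* Go byte by byte until p sits on a 4-byte boundary. */
    while (((uintptr_t)p & 3u) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* Every load now covers exactly one aligned word, so it reads at
       most 3 bytes past the terminator and never touches the next word. */
    for (;;) {
        uint32_t w;
        memcpy(&w, p, sizeof w);  /* aligned load, written via memcpy */
        if (HAS_NULLBYTE(w))
            break;
        p += 4;
    }

    /* A zero byte is somewhere in this word; locate it. */
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

Since the loop never loads a word that straddles a 4-byte boundary, the only bytes it touches beyond the terminator share a word with the terminator itself, which is the case argued above to be harmless.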