Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!utcsri!greg
From: greg@utcsri.UUCP
Newsgroups: comp.arch,comp.lang.c
Subject: Re: String Processing Instruction
Message-ID: <4605@utcsri.UUCP>
Date: Thu, 16-Apr-87 22:39:26 EST
Article-I.D.: utcsri.4605
Posted: Thu Apr 16 22:39:26 1987
Date-Received: Fri, 17-Apr-87 04:39:48 EST
References: <15292@amdcad.UUCP> <693@jenny.cl.cam.ac.uk> <7349@boring.mcvax.cwi.nl>
Reply-To: greg@utcsri.UUCP (Gregory Smith)
Organization: CSRI, University of Toronto
Lines: 32
Xref: utgpu comp.arch:887 comp.lang.c:1631
Summary: 

In article <693@jenny.cl.cam.ac.uk> am@cl.cam.ac.uk (Alan Mycroft) writes:
>   #define has_nullbyte_(x) ((x - 0x01010101) & ~x & 0x80808080)
>Then if e is an expression without side effects (e.g. variable)
>   has_nullbyte_(e)
>is nonzero iff the value of e has a null byte.

If one of the bytes contains 0x80, then a test like has_nullbyte_()
may say 'yes' even though there is no null. This can be circumvented
by a more thorough test after this one to see if there really is a
null there.
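
A rough sketch of that two-step idea, assuming 32-bit words held in a
uint32_t. The names really_has_null() and word_has_null() are made up
for illustration. Because the second step decides the answer, the
result is right whether or not the quick test can misfire:

```c
#include <stdint.h>

/* The quick test as quoted above, with the argument parenthesised. */
#define has_nullbyte_(x) (((x) - 0x01010101u) & ~(x) & 0x80808080u)

/* Thorough test (hypothetical helper): look at each of the four bytes. */
static int really_has_null(uint32_t w)
{
    return (w & 0x000000ffu) == 0 || (w & 0x0000ff00u) == 0 ||
           (w & 0x00ff0000u) == 0 || (w & 0xff000000u) == 0;
}

/* Cheap filter first; confirm only when the filter fires. */
static int word_has_null(uint32_t w)
{
    if (has_nullbyte_(w) == 0)
        return 0;               /* filter says no null: done */
    return really_has_null(w);  /* filter fired: make sure */
}
```

The thorough test only runs on words where the filter fires, so it
costs next to nothing on long runs of ordinary text.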

Jack Jensen's subroutine does this apparently by accident;
there is a 'while( *tc++ = *fc++ );' after the test finds a 'null'
so the only effect of having a 0x80 byte will be to revert to
the normal strcpy for the rest of the string.
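
To show the shape of such a routine, here is a minimal sketch. It is
NOT the posted subroutine; word_strcpy() is a hypothetical name, and
it assumes the source buffer can be read in whole 4-byte words (i.e.
it is padded out to a multiple of 4 bytes):

```c
#include <stdint.h>
#include <string.h>

#define has_nullbyte_(x) (((x) - 0x01010101u) & ~(x) & 0x80808080u)

/*
 * Hypothetical word-at-a-time copy.  ASSUMPTION: the source is readable
 * in whole 4-byte words, so loading a word never runs off the buffer.
 */
static char *word_strcpy(char *to, const char *from)
{
    char *tc = to;
    const char *fc = from;
    uint32_t w;

    for (;;) {
        memcpy(&w, fc, sizeof w);   /* load a word, alignment-safe */
        if (has_nullbyte_(w))       /* test fired: leave the fast loop */
            break;
        memcpy(tc, &w, sizeof w);   /* no null here: copy all 4 bytes */
        tc += sizeof w;
        fc += sizeof w;
    }
    while ((*tc++ = *fc++) != '\0') /* ordinary strcpy finishes the job */
        ;
    return to;
}
```

If the test ever fires without a real null, the byte loop just takes
over early, so the copy is still correct and only slower.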

Someone posted a similar but usually better test on comp.arch ( I think )
a little while ago:

#define has_nullbyte(e)  (((((e)+0x7efefeff)^(e))&0x81010100) != 0x81010100)

This one is only 'fooled' by a 0x80 in the most significant byte,
which makes the following test much simpler ( a sign test ).
In either case ( especially in this one with two identical
constants ) it helps a lot if the constants are in registers while
the loop is running. When using C, you can do this explicitly:

register unsigned long k1=0x7efefeffUL, k2=0x81010100UL;
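
Putting the pieces together, a sketch of how a scanning loop might
look. word_strlen() is a made-up name, uint32_t stands in for a 32-bit
machine word, and the buffer is assumed to be readable in whole 4-byte
words. The sign test here is my reading of the idea: if the test fires
and the top bit of the word is clear, the top byte cannot be 0x80, so
there must really be a null; otherwise the bytes get checked one by one.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical word-at-a-time strlen.  ASSUMPTION: the buffer is readable
 * in whole 4-byte words (padded to a multiple of 4 bytes).
 */
static size_t word_strlen(const char *s)
{
    /* Both constants live in locals so the compiler can keep them in
       registers for the whole loop. */
    register uint32_t k1 = 0x7efefeffu, k2 = 0x81010100u;
    const char *p = s;
    uint32_t w;

    for (;;) {
        memcpy(&w, p, sizeof w);            /* alignment-safe word load */
        if ((((w + k1) ^ w) & k2) != k2) {  /* quick test fired */
            /* Sign bit clear: top byte is not 0x80, so a null is there.
               Sign bit set: could be a false alarm, so check the bytes. */
            if ((w & 0x80000000u) == 0 ||
                memchr(p, '\0', sizeof w) != NULL)
                break;
        }
        p += sizeof w;
    }
    while (*p != '\0')                      /* find the null in this word */
        p++;
    return (size_t)(p - s);
}
```

Whether the register declaration buys anything depends on the
compiler; many will put k1 and k2 in registers without being asked.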

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...