Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!wuarchive!rex!uflorida!gatech!purdue!haven!ncifcrf!lhc!adm!smoke!gwyn From: gwyn@smoke.brl.mil (Doug Gwyn) Newsgroups: comp.sys.apple2 Subject: Re: CRCs Message-ID: <15915@smoke.brl.mil> Date: 20 Apr 91 22:47:39 GMT References: <13688@ucrmath.ucr.edu> Organization: U.S. Army Ballistic Research Laboratory, APG, MD. Lines: 101 In article <13688@ucrmath.ucr.edu> rhyde@ucrmath.ucr.edu (randy hyde) writes: >Assembly language typically works out a little better than twice as fast. But the comparison was not really fair, since the assembly-language example implementation did not support the same linkage as the C code. For a function that performs so little work, function linkage overhead is an important factor. Also, the example C code was not optimal for the given algorithm. By making a few minor tweaks to it I reduced the C example run time by a factor of 26% on the host on which I read net news. Profiling the code showed that approximately as much time was being spent in the main program's loop overhead as in the crc function itself. For the revised crc function C source code: int crc(data, curcrc) int data; register unsigned int curcrc; { return (curcrc << 8) ^ crctab[(curcrc >> 8) ^ data]; } a fairly ordinary old-technology C compiler on a Gould PowerNode produced the following machine code: _crc: enter 8w,0x5 -- function linkage movw 9w+8w[b2],r1 -- pick up "curcrc" argument movw r1,r0 -- make a second copy of it lslw #8,r0 -- shift one copy left (accumulator) lsrw #8,r1 -- shift the other copy right xorw 8w+8w[b2],r1 -- XOR shifted copy with "data" argument lslw #2,r1 -- shift for indexing crctab[] xorw _crctab[r1],r0 -- XOR table entry with accumulator retn -- function linkage and a similar compiler on a DEC VAX produced the following machine code: _crc: .word 0x800 ; function linkage movl 8(ap),r11 ; pick up "curcrc" argument extzv $8,$24,r11,r0 ; shift one copy left (accumulator) xorl2 4(ap),r0 ; XOR shifted copy with "data" argument ashl $8,r11,r1 ; shift the other copy right xorl3 r1,_crctab[r0],r0 ; XOR table entry with accumulator ret ; function linkage Note that there isn't very much that one could do even by hand-coded assembly language to perform this computation much more efficiently. (Except that the VAX actually has microcode support for a CRC instruction! That is an example of a situation in which use of assembly language could be justified, since it is somewhat rare for a C compiler to exploit such specialized instructions. Another possible example would be in a bottleneck loop when testing the "carry bit" would be much more efficient than trying to do without it; originally I thought that was going to be the justification for coding the CRC algorithm in assembler.) >I claim the ASM code above isn't any more difficult to understand than the >C code (assuming, of course, that you know the algorithm that the code >is attempting to implement). For such a small function, the conceptual burden is indeed manageable. For a larger example involving linked data structures, etc. I would prefer to deal with C code over assembly code. >From a development point of view, it would take far more time to dream >up, implement, and verify the nibble-based C code that Doug posted than >it would take to implement the unrolled version of the original CRC >algorithm in assembly. But that's comparing apples and oranges. For the same algorithm with the same constraints (particularly function linkage), I don't think you've shown that assembly language code is appreciably faster than C code, taking into account my earlier observations. >I offer this as proof that assembly is often the better choice for a >project. Sure, you can find better algorithms in C which run faster than >bad algorithms in ASM, but the end result is that you spend *far* more >time designing, testing, and debugging the C code. And if you can find >a better algorithm that works in C, that same algorithm will work in >assembly as well. The C code, if you do it right, is immediately usable on a variety of machines, and development cost of the algorithm is independent of the implementation language. Implementation of the algorithm in C is no harder than in assembly language, and for complex algorithms it is very nice to be able to rely on the compiler for help with the bookkeeping. Why do you think high-level languages were invented? >Keep in mind, that the CRC algorithm is intrinsically O(n). You're not >going to do any better than this because it takes n operations to read >the data. Therefore, the only thing we can attack is the constant >in the complexity of this algorithm. ASM is really good at this. And if you look at the above C compiler output, you'll see that C is also really good at this, at least on decent machine architectures. As I have previously remarked, there are legitimate uses for assembler, but since C is about as fast, infinitely more portable, and easier to use for large projects, I recommend C over assembler in the majority of situations.