Xref: utzoo comp.sources.d:2011 comp.binaries.ibm.pc.d:74
Path: utzoo!mnetor!uunet!husc6!panda!teddy!jpn
From: jpn@teddy.UUCP (John P. Nelson)
Newsgroups: comp.sources.d,comp.binaries.ibm.pc.d
Subject: Re: Standard for file transmission
Message-ID: <4745@teddy.UUCP>
Date: 4 May 88 13:13:44 GMT
References: <292@cullsj.UUCP> <55@psuhcx.psu.edu> <537@csccat.UUCP> <296@cullsj.UUCP>
Reply-To: jpn@teddy.UUCP (John P. Nelson)
Organization: GenRad, Inc., Concord, Mass.
Lines: 77
Keywords: protocol compression source

>  1) COMPRESS is a text only compression routine.  It will not now, or ever,
>     help in the compression of binary files.

Whoa!  Where did THIS come from!?!?  It is simply not true!

It IS true that compress does a better job at compressing text files,
but this is because there is usually more redundancy in text files than in
most binary files (like executables).  Compress is simply MARVELOUS for
binary files like bit-mapped graphics, getting something like 90%
compression for many of them.
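
To make the point concrete, here is a rough sketch of a Lempel-Ziv-Welch style
coder that only counts output bits.  It is NOT the compress source: the
function name, the synthetic "bitmap", and the frozen-when-full dictionary are
all assumptions made for illustration.  It shows binary data with lots of
repetition shrinking dramatically, while random bytes actually grow.

```python
import random


def lzw_compressed_bits(data: bytes, max_bits: int = 12) -> int:
    """Count the bits a simple LZW coder would emit for `data`.

    Simplifying assumptions (hypothetical, not compress's exact behavior):
    codes start 9 bits wide and grow up to max_bits, and a full
    dictionary is frozen instead of being reset.
    """
    if not data:
        return 0
    table = {bytes([i]): i for i in range(256)}  # all single bytes
    next_code = 256
    width = 9
    limit = 1 << max_bits
    bits = 0
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate          # keep extending the match
            continue
        bits += width                    # emit the code for `current`
        if next_code < limit:
            table[candidate] = next_code
            next_code += 1
            if next_code > (1 << width) and width < max_bits:
                width += 1               # codes no longer fit, widen
        current = bytes([value])
    bits += width                        # flush the last pending string
    return bits


def ratio(data: bytes, max_bits: int = 12) -> float:
    """Output size divided by input size (below 1.0 means it shrank)."""
    return lzw_compressed_bits(data, max_bits) / (8 * len(data))


# Synthetic 1-bit-per-pixel "image": 256 identical rows of 64 bytes,
# blank except for a vertical bar.  Made-up data, not a real picture.
bitmap = (b"\x00" * 30 + b"\xff\xff" + b"\x00" * 32) * 256

# Random bytes of the same length, seeded so the run is repeatable.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(bitmap)))

bitmap_ratio = ratio(bitmap)
random_ratio = ratio(noise)
print(f"bitmap: {bitmap_ratio:.3f} of original size")
print(f"random: {random_ratio:.3f} of original size")
```

What matters is how much repetition the data has, not whether it is "text" or
"binary".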

>  2) ARITH is a more general compression routine using adaptive arithmetic 
>     coding.  It will compress binary files where there is redundancy, but
>     when it fails (on an extremely random file) the result increases very
>     little (under 1% in my experience).  It compresses better than HUFFMAN,
>     but it is NOT faster than SQ/UNSQ which are written in assembler whereas
>     ARITH is written in C.
>     (Once again, i will post it if there is sufficient interest.)

Now we get some facts.  ARITH is arithmetic coding, a statistical method in
the same family as HUFFMAN encoding.  Compress is Lempel-Ziv encoding.
Lempel-Ziv almost ALWAYS beats HUFFMAN (when there is redundancy).  It is
certainly possible that Lempel-Ziv might expand random files more than
HUFFMAN; I haven't done any tests.
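
A small sketch of why a coder driven only by single-byte frequencies loses on
repetitive input.  It computes the size of an optimal static Huffman coding,
ignoring the cost of storing the code table.  The function name and the sample
input are made up for illustration; this is not the code from SQ or ARITH.

```python
import heapq
from collections import Counter


def huffman_coded_bits(data: bytes) -> int:
    """Total bits needed to code `data` with an optimal static Huffman code.

    Uses the fact that the coded length equals the sum of the weights of
    all merged (internal) nodes built while constructing the tree.
    """
    counts = Counter(data)
    if not counts:
        return 0
    if len(counts) == 1:
        return len(data)        # even a lone symbol costs 1 bit each time
    heap = list(counts.values())
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)  # two lightest subtrees
        b = heapq.heappop(heap)
        total += a + b           # every symbol below gains one bit
        heapq.heappush(heap, a + b)
    return total


# Eight equally frequent byte values, repeated as one fixed string.
sample = b"abcdefgh" * 1000
huffman_bits = huffman_coded_bits(sample)
print(huffman_bits / (8 * len(sample)))   # fraction of the original size
```

The coder sees eight equally likely symbols and spends 3 bits on each, so the
output is 37.5% of the input no matter how many times the string repeats.  A
Huffman code also can never spend less than 1 bit per byte.  A Lempel-Ziv
coder has neither limit, because it codes whole repeated strings.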

Older versions of ARC used to try both HUFFMAN and Lempel-Ziv, and use
the one that gave better compression.  The HUFFMAN support was dropped
(except for extracting from old archives), because Lempel-Ziv beat HUFFMAN
99% of the time!

>  3) The source for ZOO, PKARC, and the others is NOT available.  Therefore
>     we are at the whims of whomever is currently supporting (or not supporting)
>     them.

MORE untruths.  The source for both ZOO and ARC is in C, and has been
distributed on USENET several times!  Some versions of the ARC source
included the extra code to handle the SQUASH compression algorithm
added by PKARC.

>  4) COMPRESS works faster and better on text files than the ARC routines
>     because they use 12 bit compression, where 13-bit (and more) are possible
>     under even the PC for COMPRESS (i've tried it on an AT-clone).

PKARC's SQUASH is 13 bit compression.  Any more than this requires a
working buffer larger than 64K, which is why larger code sizes are generally
not used very much on PCs.  The amount of additional compression between
13 bit and 16 bit is no more than 2 or 3 percent!

Also, there is very little difference in speed between the 12 bit and
13 bit compression algorithms.  The major difference is in the memory
requirements.
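
Back-of-the-envelope arithmetic for the memory point.  The figures here are
ASSUMPTIONS, loosely modeled on a hashed string table: 6 bytes per slot (a
4-byte key plus a 2-byte code) and about 10% spare slots.  The real numbers in
compress or PKARC may differ, but the shape of the result is the same: each
extra bit doubles the table.

```python
SEGMENT_BYTES = 64 * 1024   # one 64K segment on the PC


def table_bytes(code_bits: int, bytes_per_slot: int = 6) -> int:
    """Estimated string-table size for a given maximum code width.

    Assumed layout (hypothetical): 2**code_bits codes, plus 10% spare
    hash slots, at `bytes_per_slot` bytes each.
    """
    slots = (1 << code_bits) * 11 // 10
    return slots * bytes_per_slot


for bits in range(12, 17):
    size = table_bytes(bits)
    verdict = "fits in 64K" if size <= SEGMENT_BYTES else "exceeds 64K"
    print(f"{bits} bits: {size:7d} bytes  {verdict}")
```

Under these assumptions 13 bits is the last width that fits in a single
segment; 14 bits already needs roughly twice that.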


>  5) On the weak side, there is as yet, no CRC or checksum for any of these,
>     but adding it would be something i am willing to take responsibility
>     for should enough people decide they would like to take the approach
>     which i'm currently suggesting.

This is the LEAST of the problems with using compress.

>     Also, there is no directory support provided with these tools.  They work
>     on only one file at a time.  This is also correctable since the source
>     is available.

True, but why reinvent the wheel?  The source for the EXISTING programs is
ALSO available!

>   If this sounds like a flame, then please assign my apparent bad attitude to
>poor methodology rather than a desire to upset people.  This is provided in the
>spirit of adding to what i hope will become a meaningful dialog with a very
>practical result.

Your bad attitude appears to be due to an overdose of misinformation!
-- 
     john nelson

UUCP:            {decvax,mit-eddie}!genrad!teddy!jpn
ARPA (sort of):  talcott.harvard.edu!panda!teddy!jpn