Path: utzoo!attcan!uunet!tut.cis.ohio-state.edu!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!aplcen!haven!mimsy!jds
From: jds@mimsy.umd.edu (James da Silva)
Newsgroups: comp.os.minix
Subject: Re: LHARC available for MINIX!
Keywords: lharc archive compress
Message-ID: <25280@mimsy.umd.edu>
Date: 3 Jul 90 20:35:02 GMT
References: <1990Jul2.143113.2267@jarvis.csri.toronto.edu> <781@rossignol.Princeton.EDU> <1990Jul3.011235.5774@jarvis.csri.toronto.edu> <784@rossignol.Princeton.EDU>
Reply-To: jds@cs.umd.edu (James da Silva)
Organization: University of Maryland, Department of Computer Science
Lines: 47

In article <784@rossignol.Princeton.EDU> nfs@cs.Princeton.EDU (Norbert
Schlenker) writes: 
>This header CANNOT be portable between machines.  It depends on the
>size of short integers (usually 16 bits, but they don't have to be),
>on the size of long integers (usually 32 bits, but ...), on the
>endianness of the machines which read and write the archive, on the
>padding between structure elements, and on the fact that Unix modes,
>uids, and gids all fit into unsigned short integers.

Yes, but there's nothing that says you have to read it directly into the
structure.  Knowing that the "native" layout for LZH headers is, say,
that produced by an MS-DOS compiler, you *can* write a portable routine
to read LZH headers by reading into a char array and extracting the
fields one by one.  Likewise for writing such headers.
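
Something along these lines, say.  This is only a sketch: the three-field
header, its offsets, and all the names below are made up for illustration
and are NOT the real LZH layout.  The only thing carried over is the
assumption that multi-byte fields are stored low byte first, as an MS-DOS
compiler would write them.

```c
#include <stdio.h>

/* Hypothetical header, for illustration only (not the real LZH fields). */
struct hdr {
    unsigned long packed_size;  /* 32 bits on disk, at offset 0 */
    unsigned long orig_size;    /* 32 bits on disk, at offset 4 */
    unsigned int  mode;         /* 16 bits on disk, at offset 8 */
};

#define HDR_BYTES 10            /* on-disk size of the hypothetical header */

/* Extract a 16-bit little-endian value, whatever the host byte order. */
unsigned int get_le16(const unsigned char *p)
{
    return (unsigned int)p[0] | ((unsigned int)p[1] << 8);
}

/* Extract a 32-bit little-endian value, whatever the host byte order. */
unsigned long get_le32(const unsigned char *p)
{
    return (unsigned long)p[0]
         | ((unsigned long)p[1] << 8)
         | ((unsigned long)p[2] << 16)
         | ((unsigned long)p[3] << 24);
}

/* Store the low 16 bits of v, low byte first. */
void put_le16(unsigned char *p, unsigned int v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
}

/* Store the low 32 bits of v, low byte first. */
void put_le32(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Pull the fields out of the raw bytes one by one. */
void unpack_hdr(const unsigned char *buf, struct hdr *h)
{
    h->packed_size = get_le32(buf + 0);
    h->orig_size   = get_le32(buf + 4);
    h->mode        = get_le16(buf + 8);
}

/* The reverse: lay the fields out byte by byte for writing. */
void pack_hdr(unsigned char *buf, const struct hdr *h)
{
    put_le32(buf + 0, h->packed_size);
    put_le32(buf + 4, h->orig_size);
    put_le16(buf + 8, h->mode);
}

/* Read one header from fp; returns 0 on success, -1 on a short read. */
int read_hdr(FILE *fp, struct hdr *h)
{
    unsigned char buf[HDR_BYTES];

    if (fread(buf, 1, HDR_BYTES, fp) != HDR_BYTES)
        return -1;
    unpack_hdr(buf, h);
    return 0;
}
```

Nothing here depends on sizeof(short), sizeof(long), structure padding, or
the byte order of the machine running it; the in-memory struct just has to
be wide enough to hold the values.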

I don't know whether the source Wayne Hayes is referring to does this, but
it can be done.  It would be useful to have such a portable program for
Unix and Minix, as LZH files are becoming more popular.

>The solution is a more portable form of header.  As I suggested before,
>something like a TAR header, cut back in size from 512 bytes, will be
>necessary.  Without that, this archive format will not fly.

The problem is that a more "portable" header would be incompatible with the
current LZH header, making the files you produce themselves less
transportable.  You couldn't really call the output LZH.

Actually, I do agree with you; this format will not fly as a generic Unix
file transfer format.  The combined archiver/compressor is too much of a
DOS-ism.

But rather than coming up with a new header format, how about separating
out the LHARC compress/decompress routines and making them standalone, a la
the current compress(1)?  Then we can use tar just like we've always done.
"tar.L" files, anyone?

I do have one question: Does LZH require reading the input twice or
creating a temporary file?  I seem to recall that normal Huffman encoding
required one pass to determine the relative frequencies of input tokens,
then another pass to do the encoding.  Does lharc work the same way?  That
would rule out its use in situations where on-the-fly compression through
pipes is needed.
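
To make the worry concrete, here is a rough sketch of the first pass of a
plain two-pass Huffman coder as I remember the scheme.  It says nothing
about what lharc actually does; count_pass is a made-up name.  The point is
the seek at the end, which cannot succeed when the input is a pipe.

```c
#include <stdio.h>

/*
 * Pass 1 of a hypothetical two-pass Huffman coder: count how often each
 * byte value occurs, then try to seek back to the start for pass 2.
 * Returns 0 if the input is positioned for the second pass, or -1 if it
 * cannot be rewound (a pipe, for instance).
 */
int count_pass(FILE *fp, unsigned long freq[256])
{
    int c, i;

    for (i = 0; i < 256; i++)
        freq[i] = 0;

    while ((c = getc(fp)) != EOF)
        freq[c]++;

    /* The encoding pass needs the same data again from the beginning. */
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    return 0;
}
```

A coder built this way would have to copy stdin to a temporary file before
it could start, which is exactly the cost I am asking about.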

Jaime
...........................................................................
: domain: jds@cs.umd.edu				     James da Silva
: path:   uunet!mimsy!jds	 	    Systems Design & Analysis Group