Path: utzoo!mnetor!uunet!ncc!alberta!edson!tic!dragos!work From: work@dragos.UUCP (Dragos Ruiu) Newsgroups: comp.dcom.modems Subject: Re: Merits of compression Message-ID: <410@dragos.UUCP> Date: 27 Apr 88 21:33:58 GMT References: <580@spectrix.UUCP> <406@dragos.UUCP> <1473@pt.cs.cmu.edu> <1529@pt.cs.cmu.edu> Organization: UNIX Device, Canada Lines: 57 Summary: You can do 12 bit LZW with just 32K In article <1529@pt.cs.cmu.edu>, tgl@ZOG.CS.CMU.EDU (Tom Lane) writes: > In article <580@spectrix.UUCP>, John_M@spectrix.UUCP (John Macdonald) writes: > [Talks about compressing incoming uncompressed data to keep LZW tables good] > Yeah, that's the obvious thing to do. The trouble is that compression > and decompression tables are not the same thing; if you think about it, > they perform reverse functions, so it's not surprising that they need > to be indexed differently. For LZW compression like the TrailBlazer > uses, the compression table is usually a hash table indexed by > ; while the decompression table can > be a simple array indexed by . > > I haven't come up with any better way of doing this idea than having > the receiver maintain BOTH compression and decompression tables, > updating both tables whenever a packet is received (using either the > compression or decompression algorithm, as required). This is a > nontrivial hit in CPU terms, but the place where it really hurts you is > memory space. A compression table for 12-bit-symbol LZW (TB algorithm) > is 24KB or more, depending on what kind of hash table load factor > you're willing to tolerate (= more CPU hit...). A TrailBlazer hasn't > got that kind of memory to spare, I would imagine. > > -- > tom lane > Internet: tgl@zog.cs.cmu.edu > UUCP: !zog.cs.cmu.edu!tgl > BITNET: tgl%zog.cs.cmu.edu@cmuccvma First a couple of points: -When decide to turn compression off, keeping the tables updated is probably of little significance, because the codes being built are useless. -Last I heard the Telebit was reputed to have 512K of ram. You can keep a compression table and a decompression table simultaneously. Simply have the hash table indicate entries in the (de)compression table instead of keeping the code and the follow. (This saves you memory even if you don't care about keeping a decompression table.) The disadvantage of this approach is that you take an extra indirection hit on every hash table lookup. The hash table gives you the entry number, which you must then look up in the entry table. A minimal memory setup for bi-directional compression is about 32K: 10K decompression table: 4096 * (1.5 byte code + 1 byte follow) 10K compression entry table: " " 12K hash table: 8096 * 1.5 byte compression table entry numbers. This has a 50% hash table occupancy. And if you wanted to do 'compression' on the decompressed input like John MacDonald suggested, you have to add another 12K hash table that points to entries in the decompression table. -- Dragos Ruiu ruiu@dragos.UUCP "One can ceratinly imagine the myriad of ...alberta!dragos!ruiu uses for a hand-held iguana maker." -Hobbes