Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!think.com!zaphod.mps.ohio-state.edu!mips!atha!aunro!alberta!ubc-cs!unixg.ubc.ca!cheddar.ucs.ubc.ca!buckland
From: buckland@cheddar.ucs.ubc.ca (Tony Buckland)
Newsgroups: comp.databases
Subject: Re: In Search Of a key compression algorithm!
Keywords: Non-lexigraphically destructive.
Message-ID: <1991May29.211100.12909@unixg.ubc.ca>
Date: 29 May 91 21:11:00 GMT
References: <35014@mimsy.umd.edu>
Sender: news@unixg.ubc.ca (Usenet News Maintenance)
Organization: Computing Services, University of British Columbia
Lines: 16
Nntp-Posting-Host: cheddar.ucs.ubc.ca

In article <35014@mimsy.umd.edu> jerryw@tove.cs.umd.edu (Jerry Wieber) writes:
>I am looking for an algorithm that can compress keys without destroying
>lexigraphic ordering.  A reply to the effect of "this is impossible" is
>equally helpful, of course.  Any compression, no matter how small, may
>be useful. ...

 If you can specify that only a fixed, small character set is
 used in the keys, usually the case since keys tend to be
 alphabetic plus a little punctuation, translate each key byte to
 a 1-byte integer representing the ordering of the characters you
 use (e.g. 00 = blank, 01 = a, ...), figure out the minimum number
 of bits you need to express those integers (5 might do it if you
 want to disregard case, so that a=A, b=B, ...), and compress
 n bytes of key to n*number of bits.  This will get you 3/8 or
 possibly 2/8 shrinkage in the key length, with no loss of
 ordering.