Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!think.com!hsdndev!cmcl2!kramden.acf.nyu.edu!brnstnd
From: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)
Newsgroups: comp.compression
Subject: Re: Compression figures
Message-ID: <17734:Apr1614:47:4391@kramden.acf.nyu.edu>
Date: 16 Apr 91 14:47:43 GMT
References:
Organization: IR
Lines: 34

In article victor@watson.ibm.com writes:
> I've just come back from DCC '91 (which was a pretty good conference),
> and was reminded of one of my pet peeves: reporting compression
> figures. There doesn't seem to be any standard way.

I'm all in favor of reporting the original size and the compressed size,
at least in research papers. It's hard to get more direct than that. Any
extra information is at best a minor convenience.

>   1) Orig/New
>   2) New/Orig
>   3) (Orig-New)/Orig
>   4) Orig/(B*New)

The problem with (1) is that for a 20K file, the difference between 2K
and 1.81K compressed is magnified (1000% versus 1105%), while the vastly
more important difference between 40K and 45K compressed is shrunk (50%
versus 44%). (4) pleases the information theorist types but is rather
annoying to use in practice. Between (2) and (3)... well, data points:

  Script started on Tue Apr 16 10:43:00 EDT 1991
  csh> compress -v < /etc/hosts > /dev/null
  Compression: 63.46%
  csh> yabba -v < /etc/hosts > /dev/null
  In: 311614 chars  Out: 103117 chars  Y'ed to: 33%
  csh> yabba -^ < /etc/hosts > /dev/null
  In: 311614 chars  Out: 103117 chars  Y'ed by: 67%
  csh>
  Script done on Tue Apr 16 10:44:00 EDT 1991

I hope -^ (-v inverted) is sufficiently mnemonic for people who like (3).

---Dan
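
For anyone who wants to check the arithmetic behind measures (1) through (3), here is a minimal Python sketch. The names `figures` and `pct` are hypothetical, made up for illustration, and percentages are rounded to the nearest whole number, which is an assumption about how the figures in the post were rounded. Measure (4) is left out because B is not defined in the quoted text.

```python
def figures(orig, new):
    """Return measures (1)-(3) from the quoted list, as plain ratios.

    orig and new are sizes in the same unit (bytes, K, chars, ...).
    """
    return {
        "orig_over_new": orig / new,       # (1) Orig/New
        "new_over_orig": new / orig,       # (2) New/Orig
        "saved": (orig - new) / orig,      # (3) (Orig-New)/Orig
    }


def pct(ratio):
    """Express a ratio as a whole-number percentage (rounding is an assumption)."""
    return round(100 * ratio)


if __name__ == "__main__":
    # The 20K file from the post, at four hypothetical compressed sizes (in K).
    for new_k in (2, 1.81, 40, 45):
        f = figures(20, new_k)
        print(f"20K -> {new_k}K: (1) {pct(f['orig_over_new'])}%  "
              f"(2) {pct(f['new_over_orig'])}%  (3) {pct(f['saved'])}%")

    # The yabba run on /etc/hosts, using the In/Out counts from the transcript.
    y = figures(311614, 103117)
    print(f"yabba: to {pct(y['new_over_orig'])}%, by {pct(y['saved'])}%")
```

Under measure (1) the 2K and 1.81K cases come out 105 points apart while the 40K and 45K cases come out only 6 points apart, which is the distortion described in the post. The yabba counts give 33% under (2) and 67% under (3), matching the "Y'ed to" and "Y'ed by" lines.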