Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!think.com!hsdndev!cmcl2!kramden.acf.nyu.edu!brnstnd
From: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)
Newsgroups: comp.compression
Subject: Re: Reply to Dan Bernstein's criticisms of standard.
Message-ID: <28989.Jun2219.07.0391@kramden.acf.nyu.edu>
Date: 22 Jun 91 19:07:03 GMT
References: <873@spam.ua.oz>
Organization: IR
Lines: 73

In article <873@spam.ua.oz> ross@spam.ua.oz.au (Ross Williams) writes:
> I'm not convinced. I think that allowing algorithm parameters will
> open up a really disgusting can of worms. I see no harm in having a
> library of exact tuned algorithms.

I do. Consider, for instance, my yabba coder. There's not only the
memory parameter -m, but a blocking test length -z and ``fuzz'' -Z that
affect compression. Twiddling -z and -Z can produce up to 10%
improvements on some types of files; there's no small set of best
choices, and forcing users into fixed values of -z and -Z, let alone -m,
would be insane.

> If a user wants to create another tuning, he can just fiddle with the
> source code for the algorithm until he is happy and then fiddle the
> identity record to create a new algorithm.

Uh-uh. This contradicts your stated goal of having the identification
numbers mean something for comparison purposes. If you want to achieve
that goal, just insist that everyone name all the parameters (in my
case, -m/-z/-Z) when quoting compression results. (It gets even hairier
when you're measuring speed and memory: for yabba, you'd have to quote
MOD and PTRS as well as the machine type.) Don't force parameterized
coders into a parameterless model.

> (Another argument is that many algorithms can be implemented more
> efficiently if their parameters are statically specified).

Hardly.

> The user of the algorithm doesn't have to deal with oodles of
> information to USE an algorithm. Only if they want a non-implemented
> tuning.

One advantage of a program-based interface is that the user is given
options.
If he doesn't know about the options or doesn't want to deal with them,
he doesn't use them, but they're always there for sophisticated users.
You could mimic this in your library routine by passing in an array of
options, say of { int, int } pairs, with a method-dependent meaning.

[ streams versus blocks ]

I think the real argument here is over whether the compressor should
drive the I/O or vice versa. In modem applications, your standard is
fine: the I/O has to drive the compressor. But in most applications it's
much easier to work the other way round.

> |> DIRECTNESS: A stream interface will force many algorithms to become
> |> involved in unnecessary buffering.
> |Your block interface forces *all* algorithms to become involved in
> |unnecessary blocking. Again it sounds like you're concentrating on
> |zero-delay modem compression. I don't think compressors should be forced
> |into that model.
> The interface does not force the algorithms into blocking, it forces
> the user programs into it.

No, it also forces the algorithms into blocking. Many algorithms don't
want to deal with blocks. They want to deal with streams.

Here, you can notch this up as a separate criticism: You expect
algorithms to have enough special cases that they work well if you give
them blocks of three bytes at a time. But most real LZW-style
compressors work horribly on such short blocks. Most compressors do not
support plaintext or escape codes, and it is unrealistic to expect
otherwise.

> I am wary of a stream interface because it is then impossible to place
> a bound on the amount of output one could get from feeding in a single
> byte of input (during either compression or decompression).

Again it sounds like you're focusing on modems. Why?

---Dan
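
The suggestion above of passing a library routine an array of
{ int, int } option pairs with a method-dependent meaning could look
roughly like the following. This is only a sketch under stated
assumptions: the names (option_pair, coder_params, apply_options, the
OPT_* keys) and the default values are hypothetical placeholders, not
taken from yabba or from the proposed standard. The three keys simply
stand in for parameters like -m, -z and -Z.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical option keys; their meaning is private to one method. */
enum { OPT_END = 0, OPT_MEMORY = 1, OPT_BLOCK_TEST = 2, OPT_FUZZ = 3 };

/* One { int, int } pair: a key and its value. */
struct option_pair {
    int key;
    int value;
};

/* Hypothetical parameter set for one coder (think -m, -z, -Z). */
struct coder_params {
    int memory;
    int block_test;
    int fuzz;
};

/*
 * Fill *p with defaults, then apply any overrides from opts.
 * opts is terminated by an OPT_END key and may be NULL, so a caller
 * who does not care about options passes nothing at all.
 * Returns 0 on success, -1 if a key is not understood by this method.
 */
int apply_options(struct coder_params *p, const struct option_pair *opts)
{
    assert(p != NULL);

    /* Placeholder defaults, chosen for illustration only. */
    p->memory = 1000;
    p->block_test = 100;
    p->fuzz = 1;

    if (opts == NULL)
        return 0;

    for (; opts->key != OPT_END; opts++) {
        switch (opts->key) {
        case OPT_MEMORY:     p->memory = opts->value;     break;
        case OPT_BLOCK_TEST: p->block_test = opts->value; break;
        case OPT_FUZZ:       p->fuzz = opts->value;       break;
        default:             return -1;  /* unknown key for this method */
        }
    }
    return 0;
}
```

A naive caller passes NULL and gets the defaults; a sophisticated one
passes, say, { { OPT_FUZZ, 7 }, { OPT_END, 0 } } and changes only that
one setting. Rejecting unknown keys is one possible design choice; a
method could equally well ignore keys it does not recognize.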