Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!crdgw1!uunet!mcsun!corton!chorus!nocturne.chorus.fr!jloup
From: jloup@nocturne.chorus.fr (Jean-Loup Gailly)
Newsgroups: comp.compression
Subject: Re: Proposed data compression interface standard.
Keywords: data compression interface standard
Message-ID: <11116@chorus.fr>
Date: 21 Jun 91 10:18:20 GMT
References: <859@spam.ua.oz>
Sender: jloup@chorus.fr
Reply-To: jloup@nocturne.chorus.fr (Jean-Loup Gailly)
Organization: Chorus systemes, Saint Quentin en Yvelines, France
Lines: 72

In article <859@spam.ua.oz>, ross@spam.ua.oz.au (Ross Williams) writes:

| 2.2 Parameters
| --------------
| 2.2.1 A conforming procedure must have a parameter list that conveys
| no more and no less information than that conveyed by the following
| "model" parameter list.
|
| [...]
|    INOUT memory   - A block of memory for use by the algorithm.
| [...]

How do you deal with segmented architectures such as the 8086 and 286,
which impose a small limit (such as 64K) on the size of a segment?
(There are only a few million machines still using this architecture :-)
You could say that the MEMORY parameter is in fact a small array of
pointers to distinct segments, but how does the caller choose the size
of each segment? Systematically allocating 64K, except possibly for the
last segment, would generally be a waste of memory. (Take an algorithm
requiring two segments of 40K each.)

Even on non-segmented architectures, it may be cumbersome to force all
memory used by the algorithm to be contiguous. The data structures used
by a compression algorithm are usually much more complex than a single
linear array, so the algorithm has to somehow map these data structures
onto this linear sequence of bytes. This may be difficult with some
programming languages.
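
To make the mapping problem concrete, here is a minimal C sketch of an
algorithm carving two tables out of one caller-supplied block. All names
(lz_state, carve, HASH_SIZE, WINDOW_SIZE) are hypothetical and are not
part of the proposed standard. The sizes are arbitrary, and the
alignment step relies on an integer/pointer round trip whose result is
implementation-defined, so this is an illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_SIZE   4096u   /* hypothetical: number of hash table entries */
#define WINDOW_SIZE 8192u   /* hypothetical: sliding window size in bytes */

/* Hypothetical working state of an LZ-style compressor. */
struct lz_state {
    uint16_t      *head;    /* HASH_SIZE entries */
    unsigned char *window;  /* WINDOW_SIZE bytes */
};

/* Round p up to the next multiple of align (align must be a power of 2). */
static unsigned char *align_up(unsigned char *p, size_t align)
{
    uintptr_t u = (uintptr_t)p;
    u = (u + (align - 1)) & ~(uintptr_t)(align - 1);
    return (unsigned char *)u;
}

/* Map the state onto memory[0..size).
 * Returns 0 on success, -1 if the block is too small. */
static int carve(struct lz_state *s, void *memory, size_t size)
{
    unsigned char *base = memory;
    unsigned char *p = align_up(base, sizeof(uint16_t));
    size_t pad = (size_t)(p - base);
    size_t need = HASH_SIZE * sizeof(uint16_t) + WINDOW_SIZE;

    if (size < pad || size - pad < need)
        return -1;

    s->head = (uint16_t *)p;            /* the untyped-to-typed cast */
    p += HASH_SIZE * sizeof(uint16_t);
    s->window = p;                      /* window follows the hash table */
    memset(s->head, 0, HASH_SIZE * sizeof(uint16_t));
    return 0;
}
```

Even for two simple arrays the algorithm has to do its own alignment,
bounds checking and casting. Linked or variable-sized structures would
need correspondingly more bookkeeping.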

It is possible instead to add an INIT action which would let the
compression algorithm allocate the memory in an optimal fashion and
possibly return a failure boolean (or an Ada exception). Of course you
also need a CLOSE operation to deallocate this memory.

Another objection to the MEMORY parameter as proposed is that it is by
definition typeless. Even specific implementations in a strongly typed
programming language would have to use a general type which is not
suited to the algorithm. So the algorithm is forced to use type
conversions (casts in C, unchecked conversions in Ada) which are
generally not portable.

For example, some implementations of Ada store an array descriptor
before the array data or in a separate location. (Good implementations
avoid the descriptor when the bounds are known at compile time, but this
is not required by the language.) Such descriptors must be constructed
by the compiler. The implementer of the compression algorithm has no way
to magically transform a raw sequence of bytes into a properly
structured array together with its descriptor.

If you let the algorithm allocate the data, then the standard memory
allocation routine provided by the language can be used. The resulting
pointer(s) (access values in Ada terminology) are opaque objects which
can be stored in the MEMORY parameter. A type conversion is still
necessary at this point, but it is much less troublesome. The chances of
getting back a valid pointer after the reverse conversion are much
higher. (Note that an Ada access value may be quite different from a
machine address on some implementations.)

In languages supporting opaque types, such as Ada and C++, it would be
preferable to get rid of all the unsafe type conversions completely and
use a different type for the MEMORY parameter for each compression
algorithm.
But again this requires the compression algorithm to export a primitive
to allocate this opaque object, since the caller of the compression
algorithm no longer knows how to allocate it.

In short, I suggest that to avoid problems with segmented architectures
and/or strongly typed languages, the memory used by a compression
algorithm be allocated by the algorithm itself. The IDENTITY action
would still determine the maximum amount of memory that the algorithm is
allowed to allocate.

Jean-loup Gailly
Chorus systemes, 6 av G. Eiffel, 78182 St-Quentin-en-Yvelines-Cedex, France
email: jloup@chorus.fr Tel: +33 (1) 30 64 82 79 Fax: +33 (1) 30 57 00 66
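
A minimal C sketch of the suggestion above, with the algorithm
allocating its own memory behind an opaque handle. The names lz_ctx,
lz_init, lz_close and lz_max_memory are hypothetical; lz_max_memory only
stands in for the memory figure that the IDENTITY action would report,
and the two 40K pieces echo the two-segment example. None of this is
taken from the proposed standard.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical opaque handle: callers only ever hold a pointer to it. */
typedef struct lz_ctx lz_ctx;

#define HEAD_ENTRIES 20480u   /* 40K of table if unsigned short is 2 bytes */
#define WINDOW_BYTES 40960u   /* 40K */

/* In a real library this definition would be private to the .c file. */
struct lz_ctx {
    unsigned short *head;     /* allocated separately, so each piece  */
    unsigned char  *window;   /* can live in its own segment          */
    size_t          used;     /* total bytes requested by lz_init     */
};

/* Stand-in for IDENTITY: upper bound on what lz_init may allocate. */
size_t lz_max_memory(void)
{
    return sizeof(lz_ctx)
         + HEAD_ENTRIES * sizeof(unsigned short)
         + WINDOW_BYTES;
}

/* INIT: allocate the working memory. Returns NULL on failure. */
lz_ctx *lz_init(void)
{
    lz_ctx *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;

    c->head   = calloc(HEAD_ENTRIES, sizeof *c->head);  /* zeroed table */
    c->window = malloc(WINDOW_BYTES);
    if (c->head == NULL || c->window == NULL) {
        free(c->head);        /* free(NULL) is a no-op */
        free(c->window);
        free(c);
        return NULL;
    }
    c->used = sizeof *c + HEAD_ENTRIES * sizeof *c->head + WINDOW_BYTES;
    return c;
}

/* CLOSE: release everything INIT allocated. Accepts NULL. */
void lz_close(lz_ctx *c)
{
    if (c == NULL)
        return;
    free(c->head);
    free(c->window);
    free(c);
}
```

The caller never casts anything: it receives a typed lz_ctx pointer from
lz_init and hands it back to lz_close. Each allocation stays well under
64K, so no single contiguous block is required.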