Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/5/84; site umd5.UUCP
Path: utzoo!watmath!clyde!bonnie!akgua!whuxlm!whuxl!houxm!ihnp4!mhuxn!mhuxr!ulysses!allegra!mit-eddie!genrad!panda!talcott!harvard!seismo!umcp-cs!cvl!umd5!zben
From: zben@umd5.UUCP
Newsgroups: net.nlang,net.dcom,net.micro
Subject: Re: Squeezing files.
Message-ID: <604@umd5.UUCP>
Date: Thu, 20-Jun-85 19:38:18 EDT
Article-I.D.: umd5.604
Posted: Thu Jun 20 19:38:18 1985
Date-Received: Sun, 23-Jun-85 04:34:11 EDT
References: <1414@ecsvax.UUCP> <784@turtlevax.UUCP> <1861@ukma.UUCP> <789@turtlevax.UUCP>
Reply-To: zben@umd5.UUCP (Ben Cranston)
Organization: U of Md, CSC, College Park, Md
Lines: 47
Xref: watmath net.nlang:3237 net.dcom:1046 net.micro:10849
Summary: This might be interesting to the nlang folks...

I added net.nlang to the group header, because we are getting into that
area, and because I thought the nlang people might be interested in this
discussion.

In article <789@turtlevax.UUCP> ken@turtlevax.UUCP (Ken Turkowski) writes:
>In article <1861@ukma.UUCP> sean@ukma.UUCP (Sean Casey) writes:
>>In article <784@turtlevax.UUCP> ken@turtlevax.UUCP (Ken Turkowski) writes:
>>>I think you should consider changing to Lempel-Ziv Compression (posted
>>>to the net as "compress", version 3.0), which normally gives 70%
>>>compression (30% of original size) to text. The program is fast, and
>>>adapts to whatever type of data you give it, unlike static Huffman
>>>coding. It usually produces 90% (!) compression on binary images.
>>
>>Lempel-Ziv doesn't do NEARLY that well. We've been using it for
>>months, and we've found that text and program sources usually get about
>>55-65% compression, while binaries get about 45-55% compression.
>
>I can see that we have a semantic problem here. By "image", I mean a
>picture, or two-dimensional signal. By "binary", I mean ones and
>zeros, black and white, no grey-scale, no color.
>
>I'm curious; what is the etymology of the word "binary" as it is
>sometimes used to refer to executable machine code? And why does it
>imply program rather than data?

I remember way back when IBM was the only game in town, they called the
output decks produced by compilers "relocatable binaries". The Univac
system I grew up on has both "relocatable elements" and "absolute
elements", the latter sort of like "load modules" on current IBM systems:
programs linked and ready to run, but incapable of further modification.

So, Univac dropped the "binary" part. It seems another branch in the
etymology of these beasts dropped the "relocatable" and just ended up
with "binary"; on many systems there is no "absolute" form, so the
distinction was not needed.

Now, "image", to my mind, implies something else entirely. It implies a
strict one-for-one correspondence between words in-core and words in the
file. By this definition, neither the Univac (very tightly-packed format)
nor the usual Unix (because of BSS, which takes up no space in the file)
implementations qualify.

I understand on the old TOPS-10 system a running program could write a
copy of itself out to the file system, which could then later be executed
and pick up where it had left off. THIS qualifies as an "image".

Any takers?
-- 
Ben Cranston  ...{seismo!umcp-cs,ihnp4!rlgvax}!cvl!umd5!zben  zben@umd2.ARPA