Xref: utzoo comp.theory:1527 gnu.misc.discuss:2393 sci.crypt:4176 sci.misc:4798
Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!dali.cs.montana.edu!ogicse!hsdndev!cmcl2!uupsi!sunic!fuug!demos!news-server
From: Serge.Volkoff@ippi.msk.su (Serge Volkoff at IPPI Moscow USSR)
Newsgroups: comp.theory,gnu.misc.discuss,sci.crypt,sci.misc
Subject: Re: CALL FOR DISCUSSION: Create sci.compression?
Keywords: sci.compression data compression new newsgroup creation
Message-ID: <9102120717.AA17688@jumbo.hq.demos.su>
Date: 12 Feb 91 06:32:58 GMT
Sender: news-server@jumbo.hq.demos.su
Reply-To: Serge.Volkoff@ippi.msk.su
Followup-To: news.groups
Organization: unknown
Lines: 39

In <1991Feb11.004336.26106@rand.org> Ed Hall writes:

>Data compression is meaningless outside of a computer context; it is
>arguably as much a part of computer science as compiler writing or
>microprocessor design. Although little practical cryptography is
>done without computers these days, for historical reasons
>cryptography is viewed as having an existence outside of the aegis
>of computer science.

Let me note that data compression, as part of information and coding
theory, is generally called `source modeling and coding.' Surely it is
viewed by many people as a purely computer-science topic, but, as
originated in Shannon's information-theoretic work and continued by
such brilliant scientists as David Huffman, Robert Gallager, Abraham
Lempel, Jacob Ziv, Jorma Rissanen, and Glen Langdon, it is certainly
part of information theory. Just open `IEEE Transactions on Information
Theory' and you will find in each issue at least one paper devoted to
source coding. Source coding isn't as old as cryptography, but at least
it appeared earlier than most computer-science topics.

I wonder why such weak methods are always selected for practical
archivers. Isn't it because of ignorance of theoretical results in
source coding?
All archivers known to me are based on numerous modifications of the
two Lempel-Ziv algorithms (LZ77 and LZ78). As source coding analysis
shows, these are only asymptotically optimal, and the coding rate (the
inverse of the compression ratio) approaches the entropy extremely
slowly.

Very powerful universal modeling and coding methods exist (due to
Rissanen and Langdon, Cleary and Witten, et al.), primarily those based
on variable-context Markov models and the arithmetic coding technique,
a purely theoretical achievement of Jorma Rissanen. These methods are
several times slower, but they achieve nice compression ratios for
_all_ files, often _far_ beyond the current limits set by the best
existing programs: compress, pkarc, pkzip, lharc, and arj.

I vote YES to sci.compression. It would be exciting to have an
opportunity to discuss this topic with both software designers and
researchers working in source coding and computer science.

Serge Volkoff