Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!att!linac!pacific.mps.ohio-state.edu!cis.ohio-state.edu!zaphod.mps.ohio-state.edu!samsung!olivea!mintaka!bloom-beacon!eru!hagbard!sunic!mcsun!hp4nl!phigate!prle!prles2!prl.philips.nl!rogier
From: rogier@prl.philips.nl
Newsgroups: comp.compression
Subject: compress text once, decompress many
Message-ID: <2700@prles2.prl.philips.nl>
Date: 12 Apr 91 07:28:39 GMT
Sender: news@prles2.prl.philips.nl
Organization: Philips Research Laboratories Eindhoven, the Netherlands
Lines: 32

I am looking for a compression scheme for static text that is
compressed once and decompressed many times.

One way of doing it is prefix omission: the text is changed into
pointers to a dictionary, and the dictionary is stored as in the
following example:

    text entry    prefix length    stored suffix
    form          0                form
    formally      4                ally
    format        5                t

Each entry stores the number of leading characters it shares with the
previous entry, followed by the remaining suffix. The suffixes can be
compressed by Huffman coding.

Ref: Y. Choueka et al., "Compression of Concordances in Full-Text
     Retrieval Systems", ACM SIGIR, 11th Conference on Research &
     Development in Information Retrieval, June 1988, Grenoble, France.

Another way of doing it is to count the occurrences of all possible
substrings in the text, and then use a good heuristic to pick the
substrings that go into a dictionary.

Questions: Does anyone know good heuristics for this? Does anyone know
other solutions or references?

--------------------------------------------------
Rogier Wester
Philips Research Laboratories, The Netherlands.
e-mail: rogier@prle.prl.philips.nl
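
As an illustration of the prefix omission idea described in the post,
here is a minimal Python sketch. It assumes the dictionary entries are
sorted and that each prefix length is measured against the immediately
preceding entry. The function names (common_prefix_len, front_encode,
front_decode) are made up for this example and do not come from the
cited paper; the Huffman coding of the suffixes is left out.

```python
def common_prefix_len(a, b):
    """Return the number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def front_encode(words):
    """Encode a sorted word list as (prefix length, stored suffix) pairs.

    Assumption: each prefix length refers to the previous word in the list.
    """
    pairs = []
    prev = ""
    for word in words:
        k = common_prefix_len(prev, word)
        pairs.append((k, word[k:]))
        prev = word
    return pairs


def front_decode(pairs):
    """Rebuild the word list from (prefix length, stored suffix) pairs."""
    words = []
    prev = ""
    for k, suffix in pairs:
        word = prev[:k] + suffix
        words.append(word)
        prev = word
    return words


entries = ["form", "formally", "format"]
coded = front_encode(entries)
print(coded)
print(front_decode(coded))
```

For the three example words this gives (0, 'form'), (4, 'ally') and
(5, 't'), because "format" shares the five characters "forma" with
"formally". Decoding has to walk the list from the start, so a real
dictionary would probably restart with a full word every so often to
allow direct access; that is a design choice not covered here.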