Path: utzoo!utgpu!watserv1!watmath!att!pacbell!pacbell.com!ucsd!usc!cs.utexas.edu!yale!cmcl2!adm!news
From: OEYO8722%TREARN@pucc.princeton.edu ( Hur AKDULGER)
Newsgroups: comp.lang.pascal
Subject: (none)
Message-ID: <24386@adm.BRL.MIL>
Date: 4 Sep 90 18:37:30 GMT
Sender: news@adm.BRL.MIL
Lines: 53


!> From: Lane E Buchanan <bucky@UWYO.BITNET>

!> What are you trying to do with compression?

Lane;

I want to write a DICTIONARY (Turkish-English, English-Turkish).
When you give a word (the key), you get the meaning of that word.
I want to compress the file that holds the meanings, because it is
too large.

I'm thinking of replacing the most repeated words (two chars each) with unused chars.
Example:

OUR_STRING = "ABDQXYZTABDQABWXDQABABAB"

First we separate OUR_STRING into words of two bytes each
and count how many times each word repeats. (I'm using an array.)
The words are "AB" "DQ" "XY"...

AB repeated 6 times
DQ repeated 3 times
XY repeated 1 time
ZT repeated 1 time
WX repeated 1 time
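
A rough sketch of the counting step, in Python rather than Pascal and
purely for illustration. The name count_pairs is made up, and a Counter
stands in for the array. It assumes the words are taken at even offsets
so that they do not overlap, as in the example above.

```python
from collections import Counter

def count_pairs(text):
    """Count the two-char words found at offsets 0, 2, 4, ... of text."""
    # A trailing single char (odd-length text) is not a word and is skipped.
    pairs = [text[i:i + 2] for i in range(0, len(text) - 1, 2)]
    return Counter(pairs)

counts = count_pairs("ABDQXYZTABDQABWXDQABABAB")
for word, n in counts.most_common():
    print(word, "repeated", n, "time(s)")
```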

Now we replace the words that repeat more than once with unused chars.
The chars (K)(L)(M)(N)(O)(P) do not appear in OUR_STRING, so they are unused.
"AB" will be "K",                  ("K" is "AB" at step 1)
"DQ" will be "L".                  ("L" is "DQ" at step 1)

New OUR_STRING is "KLXYZTKLKWXLKKK"

We repeat this on OUR_STRING until no word repeats more than once or
there are no unused chars left.
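
The repeated passes could look something like the sketch below (again
Python, not Pascal). The names compress, free and steps are hypothetical.
It assumes the caller passes chars that really do not occur in the text,
and it records each replacement in the order it was made so that it can
be undone later.

```python
from collections import Counter

def compress(text, unused):
    """Replace repeated two-char words with unused chars, pass after pass."""
    free = list(unused)   # chars assumed not to occur anywhere in text
    steps = []            # (replacement char, word) in order of assignment
    while free:
        # Split into words at even offsets; the last piece may be one char.
        pieces = [text[i:i + 2] for i in range(0, len(text), 2)]
        counts = Counter(p for p in pieces if len(p) == 2)
        repeated = [w for w, n in counts.most_common() if n > 1]
        if not repeated:
            break         # every word now occurs only once
        table = {}
        for word in repeated:
            if not free:
                break     # ran out of unused chars
            table[word] = free.pop(0)
            steps.append((table[word], word))
        text = "".join(table.get(p, p) for p in pieces)
    return text, steps

packed, steps = compress("ABDQXYZTABDQABWXDQABABAB", "KLMNOP")
print(packed, steps)
```

With the example string, the first pass gives the string shown above, and
a second pass finds that "KL" now repeats twice and replaces it with "M".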

UnCompress algorithm:
The table is "ABK", "DQL" (each word followed by its replacement char).
When you see "K", replace it with "AB", and when you see "L", replace
it with "DQ".

"KLXYZTKLKWXLKKK" -----> will be ------> "ABDQXYZTABDQABWXDQABABAB"
 (the compressed string on the left is uncompressed to the string on the right)
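
A minimal sketch of the uncompress step, in Python for illustration
only. The name uncompress and the list-of-pairs table layout are
assumptions, not part of the description above. If the compression ran
for more than one step, the replacements have to be undone in reverse
order, last step first.

```python
def uncompress(packed, steps):
    """Undo the replacements; steps holds (replacement char, word) pairs."""
    # Later steps may have replaced chars introduced by earlier steps,
    # so the most recent replacement is expanded first.
    for char, word in reversed(steps):
        packed = packed.replace(char, word)
    return packed

table = [("K", "AB"), ("L", "DQ")]
print(uncompress("KLXYZTKLKWXLKKK", table))
```

This only works because the replacement chars never occur in the
original text; otherwise an ordinary "K" could not be told apart from a
packed "AB".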


Did you understand this algorithm? It's not complex, but I haven't tried it -))

If I find a better algorithm than this one, I'll use it.
I've read about the Huffman algorithm (many times), but I can't code it.
  (Sorry for my English *:-(

Hur AKDULGER...........