Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!decwrl!pyramid!pesnta!hplabs!sdcrdcf!randvax!jim
From: jim@randvax.UUCP
Newsgroups: net.crypt
Subject: Re: Censorship on the net
Message-ID: <318@randvax.UUCP>
Date: Fri, 23-May-86 11:11:30 EDT
Article-I.D.: randvax.318
Posted: Fri May 23 11:11:30 1986
Date-Received: Mon, 26-May-86 01:36:15 EDT
References: <3660@sun.uucp> <271@atari.UUcp> <527@polaris.UUCP>
Distribution: net
Organization: Banzai Institute
Lines: 34
Summary: No, it isn't necessarily that easy to detect encryption

In article <527@polaris.UUCP> josh@polaris.UUCP (Josh Knight) writes:
>
>It should be very easy to tell encrypted text from plain text.  The
>distribution of characters will be very different.  Just for example
>consider the table below:
>
>	Plain	Encrypted
>
>	17095	555
>	11000	554
>	....	553
>	5800	535
>	3973	532

Not all kinds of encryption will mess up the single-letter frequencies
this well.  For example, simple substitution (e.g. the ROT13 Caesar
cipher used in net.jokes) has the same single-letter frequency as the
underlying language.  The Bazeries cipher, which combines simple
substitution with permutation, would also have the same single-letter
frequency distribution.  For these it would be sufficient to note that
the high-frequency letters are different from English in the sample.

However, you can't even count on that: pure transposition systems will
leave the individual letters alone and merely shift their locations, so
that the single-letter frequency count will still look like English.

For most of my stuff I've found that looking at a measure of digraph
frequencies seems to do pretty well in general.  I mainly use it to tell
whether a [possibly modified] brute force run has finally found the
answer -- saves eyeballing a lot of printouts.
--
	Jim Gillogly
	{decvax, vortex}!randvax!jim
	jim@rand-unix.arpa
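
The post does not say which digraph measure it uses, so what follows is only one plausible reading, sketched in Python. It scores a candidate by the mean log-probability of its letter pairs under a reference table, with add-one smoothing. The helper names (letters_only, digraph_counts, digraph_score, rail_fence_2) and the one-sentence reference text are assumptions made for illustration. A usable reference table would need a large English sample.

```python
import codecs
import math
from collections import Counter

ALPHABET_SIZE = 26


def letters_only(text):
    """Lower-case the text and keep only the letters a-z."""
    return "".join(ch for ch in text.lower() if "a" <= ch <= "z")


def digraph_counts(text):
    """Count overlapping letter pairs, ignoring case, spaces and punctuation."""
    s = letters_only(text)
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def digraph_score(candidate, reference_counts):
    """Mean log-probability of the candidate's digraphs under the reference.

    Add-one smoothing keeps unseen digraphs from producing log(0).
    Higher (closer to zero) means more like the reference language.
    """
    s = letters_only(candidate)
    if len(s) < 2:
        raise ValueError("need at least two letters to form a digraph")
    total = sum(reference_counts.values()) + ALPHABET_SIZE ** 2
    logp = sum(
        math.log((reference_counts.get(s[i:i + 2], 0) + 1) / total)
        for i in range(len(s) - 1)
    )
    return logp / (len(s) - 1)


def rail_fence_2(text):
    """A toy transposition: even-indexed letters, then odd-indexed ones."""
    s = letters_only(text)
    return s[0::2] + s[1::2]


if __name__ == "__main__":
    # Hypothetical, far-too-small reference; use a large English corpus in practice.
    reference = digraph_counts("the quick brown fox jumps over the lazy dog")
    plain = "the lazy dog"

    # Transposition keeps every letter, so single-letter counts are identical.
    scrambled = rail_fence_2(plain)
    print(Counter(letters_only(plain)) == Counter(scrambled))

    # ROT13 relabels the letters but keeps the shape of the frequency table.
    rot = letters_only(codecs.encode(plain, "rot_13"))
    print(sorted(Counter(rot).values())
          == sorted(Counter(letters_only(plain)).values()))

    # The digraph score is what separates plaintext from the transposition.
    print(digraph_score(plain, reference), digraph_score(scrambled, reference))
```

In a brute-force run of the kind the post describes, each trial decryption would be passed to digraph_score and only the top few candidates printed. Any cut-off value would have to be tuned against the reference corpus actually used.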