Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!mhuxn!ihnp4!qantel!lll-lcc!lll-crg!seismo!brl-adm!brl-smoke!gwyn From: gwyn@brl-smoke.ARPA (Doug Gwyn ) Newsgroups: net.crypt Subject: Re: Code Breaking Message-ID: <331@brl-smoke.ARPA> Date: Sun, 27-Apr-86 22:38:39 EDT Article-I.D.: brl-smok.331 Posted: Sun Apr 27 22:38:39 1986 Date-Received: Fri, 2-May-86 22:40:26 EDT References: <113@radha.UUCP> Reply-To: gwyn@brl.ARPA Organization: Ballistic Research Lab (BRL) Lines: 80 In article <113@radha.UUCP> sanand@radha.UUCP (Sanand Patel) writes: >plaintext--> Some very important text that must be kept secret >My Key ----> ThisIsMySecretKeyThisIsMySecretKeyThisIsMySecretK > >Coded-------> Exclusive-Or of S+T, o+h, m+i etc. > >plaintext--> Exclusive-Or of c1+T, c2+h , c3+i etc. >on other side > >Now, not withstanding how the key is passed, is the above scheme breakable, >especially with very large keys (say 100 or 200 letters) ? How hard it is to break this depends on how much ciphertext is available and how long the key is. Step by step: A simple form of Fourier analysis can fairly reliably determine the period of the repeating key. This allows us to "stack" statistics for each cell of the cycle (this method is credited to Kasiski): ThisIsMySecretKey (key unknown except for length) Some very importa (actually, ciphertext is used here nt text that must since the plaintext isn't yet known) be kept secret.. Each column gives a uniliteral frequency distribution. Further, if there is enough text, one may be able to determine that some columns are encrypted under the same key (by cross-correlations, known as a "chi" test after Friedman). Such columns can be merged into a super-stacked frequency distribution: (For simplicity, I assume monocase text & key here.) ThisMyecrK Someer imr nt tt hatu be ptsect o. v.ap... m.ex.t ... e.ke..r... ...y..t... ...t..s... ... ...... If the statistics are good enough, you can assume the most frequent letter in each column is "E", etc. Or, assuming some simple method such as cyclic (Caesar) addition or XOR was used to combine the key with the plaintext, one may be able to check each column distribution against the small number of possibilities (more correlation), and thereby automatically determine the column keys. In other words, this method is not very secure except for very short messages per key. Supposing one were to try to defeat Kasiski analysis by using a non- repeating key? There are two possibilities: (1) random key, as in the one-time pad; (2) nonrandom key, such as text from an agreed-upon book. In the latter case, one tries to simultaneously reconstruct the key text and the plaintext; if you can guess a good crib (the standard example is to assume the plaintext starts off "REFERENCE YOUR MESSAGE ..."), then you can tell quickly that you got it right, since the recovered key is readable. It then is fairly easy to simultaneously unravel the rest of the key and plaintext. The second most hilarious encryption scheme I ever heard of used just a short key to get started, then when the key ran out it used the already-encrypted plaintext to provide the rest of the key, as in: ThisIsMySecretKeySome very important text that mu.. (KEY) Some very important text that must be kept secret.. (PLAINTEXT) (The most hilarious scheme was similar, but instead used the previous ciphertext as key for the remainder of the message! Exercise for the student: Why is this even funnier than the other?) To be really secure without using a one-time key, you have to make sure that the key is unintelligible AND that any structure it has is insufficient to exploit by algebraic statistics. For example, a shift-register (CRC style) pseudo-random sequence has plenty of internal structure and is unsafe for long message encryption.