Path: utzoo!utgpu!water!watmath!clyde!rutgers!sri-spam!ames!sdcsvax!ucsdhub!esosun!seismo!uunet!sco!chapman
From: chapman@sco.COM (Brian Chapman Mx321)
Newsgroups: sci.crypt
Subject: Re: how do you tell encrytped data from random data?
Message-ID: <275@sysco>
Date: 18 Jan 88 23:34:53 GMT
References: <660@bucket.UUCP>
Reply-To: chapman@sco.COM (Brian Chapman Mx321)
Organization: The Santa Cruz Operation, Inc.
Lines: 38

In article <660@bucket.UUCP> leonard@bucket.UUCP (Leonard Erickson) writes:
< An interesting question has crossed my mind. If someone presents you with
< an allegedly encrypted message, How can you tell if it really is encrypted
< as opposed to being a bunch of random characters?

The goal of any good encryption technique of this kind is to
make the data look random.
Any sort of pattern in the encrypted data is a crack you can
drive a decryption wedge into.

< I know that transposition and *simple* substitution can be detected by
< letter frequency analysis. But is a "flat" distribution evidence of random
< data?
< For my purposes, both "one-time pad" ciphers and anything that operates on
< units other than characters can be considered random! If it is that complex,
< then I'm not likely to crack it!

Yet UNIX crypt is a simple substitution code with the extra twists
that the offset in the file is added to the character (mod 256)
and you change substitution tables every 256 characters.
This tends to flatten out a naively taken distribution.
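
A toy sketch of the scheme as described above may make the flattening
concrete. This is not the real crypt source; the key schedule
(make_tables), the number of tables, and all the names are made up for
illustration. It only follows the description: add the file offset
mod 256, substitute, and switch tables every 256 characters.

```python
import random
from collections import Counter

def make_tables(key, n_tables=16):
    # Hypothetical key schedule: seed a PRNG with the key and shuffle.
    rng = random.Random(key)
    tables = []
    for _ in range(n_tables):
        table = list(range(256))
        rng.shuffle(table)          # each table is a permutation of 0..255
        tables.append(table)
    return tables

def invert(table):
    # Build the inverse permutation so substitution can be undone.
    inverse = [0] * 256
    for plain, cipher in enumerate(table):
        inverse[cipher] = plain
    return inverse

def toy_encrypt(data, tables):
    out = bytearray()
    for offset, byte in enumerate(data):
        table = tables[(offset // 256) % len(tables)]   # new table every 256 chars
        out.append(table[(byte + offset) % 256])        # add offset, then substitute
    return bytes(out)

def toy_decrypt(data, tables):
    inverses = [invert(t) for t in tables]
    out = bytearray()
    for offset, byte in enumerate(data):
        inverse = inverses[(offset // 256) % len(inverses)]
        out.append((inverse[byte] - offset) % 256)      # undo substitution, then offset
    return bytes(out)

tables = make_tables("not a real key")
plaintext = b"e" * 4096             # worst case: a single repeated character
ciphertext = toy_encrypt(plaintext, tables)
counts = Counter(ciphertext)
print(len(counts), min(counts.values()), max(counts.values()))
# prints: 256 16 16
```

Even a file of nothing but "e" comes out with every byte value used
exactly 16 times, because within each 256-character block the added
offset walks through every input to the table once. A plain frequency
count sees nothing, though the structure is still there for anyone who
counts by position within the file.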

< (when an 8k msg uses the 26 letters so evenly that the spread
< better most used and least used is 12, you get *real* suspicious :-)
  ^^^^^^ you mean "between"?

With 8192 characters over 26 letters (about 315 each), you mean
the character counts vary between 309 and 321.
Sounds about right to me for random looking data.

Artificially flat distributions mean either that the data is
a fraud or that the encryption method pads with extra
characters.
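
For a yardstick on what counts as artificially flat, here is a rough
simulation. It assumes "8k" means 8192 characters and that each one is
drawn independently and uniformly from the 26 letters; the names are
mine. Under those assumptions the per-letter standard deviation comes
out near 17, so purely random text would usually show a spread between
the most and least used letters several times larger than 12.

```python
import math
import random
import string
from collections import Counter

N_CHARS = 8192                      # assumption: "8k" means 8192 characters
LETTERS = string.ascii_uppercase    # 26 equally likely letters

expected = N_CHARS / len(LETTERS)   # mean count per letter
p = 1 / len(LETTERS)
sigma = math.sqrt(N_CHARS * p * (1 - p))   # binomial standard deviation

def spread(n_chars, seed):
    # Difference between the most used and least used letter counts
    # in one message of uniformly random letters.
    rng = random.Random(seed)
    counts = Counter(rng.choice(LETTERS) for _ in range(n_chars))
    per_letter = [counts[c] for c in LETTERS]   # missing letters count as 0
    return max(per_letter) - min(per_letter)

spreads = [spread(N_CHARS, seed) for seed in range(20)]
print(f"expected count per letter: {expected:.1f}")
print(f"std dev per letter:        {sigma:.1f}")
print(f"spreads over 20 trials:    min {min(spreads)}, max {max(spreads)}")
```

Comparing an observed spread against the simulated ones is a crude
test; a chi-square statistic on the 26 counts would be the more usual
way to put a number on it.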

-- 
			   uunet!-\
Brian Chapman		microsof!-->sco!chapman
			   ihnp4!-/
Pay no attention to the man behind the curtain!  -- The Great & Powerful Oz