Path: utzoo!utgpu!water!watmath!clyde!rutgers!mcnc!ecsvax!hes
From: hes@ecsvax.UUCP (Henry Schaffer)
Newsgroups: sci.crypt
Subject: Re: how do you tell encrytped data from random data?
Summary: variability of appearance of characters in random data
Message-ID: <4454@ecsvax.UUCP>
Date: 20 Jan 88 04:53:44 GMT
References: <660@bucket.UUCP> <275@sysco>
Organization: NC State Univ.
Lines: 26

In article <275@sysco>, chapman@sco.COM (Brian Chapman Mx321) writes:
> In article <660@bucket.UUCP> leonard@bucket.UUCP (Leonard Erickson) writes:
> < An interesting question has crossed my mind. If someone presents you with
> < an allegedly encrypted message, How can you tell if it really is encrypted
> < as opposed to being a bunch of random characters?
> ... 
> You mean the character counts vary between 309 and 321.
> Sounds about right to me for random looking data.
  
I'm taking "random" to mean multinomial with equal probabilities.  In
this case any one character should be found about 8k/26=307 times,
with a standard deviation of about 17.  That makes a spread of 12
between the most and least frequent of the 26 look quite small indeed.
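
For anyone who wants to check the arithmetic, here is a rough sketch. It
assumes "8k" means 8000 characters and a 26-letter alphabet, and treats
each letter's count as binomial with p = 1/26. The function names are my
own, and the simulation uses Python's general-purpose random module,
which is fine for illustration but is not a cryptographic source.

```python
import math
import random
from collections import Counter

ALPHABET_SIZE = 26   # letters A-Z
N_CHARS = 8000       # assumption: "8k" is read as 8000 characters


def expected_count_and_sd(n=N_CHARS, k=ALPHABET_SIZE):
    """Mean and standard deviation of one symbol's count (binomial, p = 1/k)."""
    p = 1.0 / k
    mean = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    return mean, sd


def simulated_spread(n=N_CHARS, k=ALPHABET_SIZE, seed=0):
    """Draw n equiprobable symbols; return (max count) - (min count)."""
    rng = random.Random(seed)
    counts = Counter(rng.randrange(k) for _ in range(n))
    per_symbol = [counts.get(i, 0) for i in range(k)]
    return max(per_symbol) - min(per_symbol)


if __name__ == "__main__":
    mean, sd = expected_count_and_sd()
    print(f"expected count per letter: {mean:.1f}, sd: {sd:.1f}")
    spreads = [simulated_spread(seed=s) for s in range(20)]
    print("simulated max-min spreads:", spreads)
```

Running the simulation for a handful of seeds gives a feel for how wide
the gap between the most and least frequent letter usually is in
equiprobable data of this size.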

> Artificially flat distributions mean that the data is
> either a fraud or the encryption method pads with extra
> characters.

Hmm, is this correct?  I don't know how flat the output of something like
Huffman encoding can be.  I know it isn't encryption, but perhaps something
like it in conjunction with encryption would give a flatter-than-random
character distribution without character padding.
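
One way to put a number on "flatter than random" is the usual chi-square
statistic against a uniform expectation. The sketch below is only an
illustration: the function name is mine, and the counts are made up
(alternating 309 and 321), not data from this thread. With 26 symbols the
statistic has 25 degrees of freedom, so for truly equiprobable data it
should land somewhere around 25, give or take about 7.

```python
def chi_square_uniform(counts):
    """Chi-square statistic of observed counts against a uniform expectation."""
    n = sum(counts)
    k = len(counts)
    expected = n / k
    return sum((c - expected) ** 2 / expected for c in counts)


if __name__ == "__main__":
    # Hypothetical counts for illustration only: 13 letters at 309, 13 at 321.
    narrow = [309, 321] * 13
    print(f"chi-square: {chi_square_uniform(narrow):.2f} (25 degrees of freedom)")
```

A value far below 25 says the counts are more even than chance alone
would be expected to produce. It does not say why they are that even.
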
> -- 
> Brian Chapman		microsof!-->sco!chapman

--henry schaffer  n c state univ