Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!uwm.edu!zaphod.mps.ohio-state.edu!brutus.cs.uiuc.edu!jarthur!polyslo!vlsi3b15!vax1.cc.lehigh.edu!sei.cmu.edu!krvw
From: lambert@cwi.nl (Lambert Meertens)
Newsgroups: comp.virus
Subject: Re: Re universal detectors.
Message-ID: <0009.9002132006.AA19225@ge.sei.cmu.edu>
Date: 13 Feb 90 19:50:21 GMT
Sender: Virus Discussion List <VIRUS-L@IBM1.CC.Lehigh.EDU>
Lines: 69
Approved: krvw@sei.cmu.edu

USERGSVE@LNCC.BITNET (GEORGE SVETLICHNY) writes:
) I'm actually mixing mathematics and technology in the above. The purely
) mathematical conjecture would be (having defined "virus" in some
) appropriate useful way): There is a decidable set W such that 1) all
) programs containing viruses are in W and 2) all programs that do not
) contain viruses *have an equivalent* (identical input-output behaviour)
) that is not in W. Program equivalence is undecidable so this does not
) contradict the undecidability of virus detection. The technological
) part would consiuAof expressing W in some convenient way through
) hardware architecture.
)
) Is this plausible? Frankly, it doesn't sound so to me, but I wouldn't
) discard it off hand. It's the only hope I see of some general
) mathematical result being useful for anti-viral strategies, so maybe we
) should look at it more closely.

To make this amenable to mathematical analysis, we need a precise
definition of "virus", of course, but I think that even without such a
definition there is an argument that makes it plausible that this *is*
in fact mathematically possible.

The argument assumes that we have at least a decidable after-the-fact test
that determines whether a program displayed, in its actual behaviour during
a given run, undesirable (virus-like) behaviour (whether by a bug or by
malicious intent of its author).

(N.B.  I don't know if we can give a useful formal definition of what
constitutes undesirable behaviour, but if we cannot capture this notion,
then any hope of creating a universal virus detector is vain.)

Moreover, it assumes (typical mathematical assumption) that we have
unlimited storage.

Given a program P, we can create another program P' which works as
follows (sketch only):

1.  Make a copy of the store.

2.  Operate like P, without touching the copy.

3.  On termination of P, determine whether undesirable behaviour occurred.

4.  If so, restore the store from the copy and flash a red light;
    otherwise, erase the copy and display a green light.
    (There could be an option to have a bell rung instead of the red
    light:-)

(In case the program does not terminate, we can press an abort button
whereupon the store is restored as well.)

The point is that there is a computable function F such that F(P) = P' with
the above characteristics, such that the range of F is a recursive set.
(The argument that F is a recursive function is simply that it is obvious
how to write a program for it.  That the range is recursive follows from
the fact that we can easily construct F such that larger input -- in some
suitable ordering -- gives rise to larger output, so to test if Y is in the
range of F we can enumerate the domain of F until we find an X such that
|F(X)| >= |Y|.)  Take the set W to be the complement of the range of F.
All benign programs have an equivalent program in W-complement, since F
does not change their input-output behaviour, and all programs in W-
complement are by their construction harmless.

I do not see how this theoretical construction would have much relevance,
but it is amusing to realise that the gist of it is to make a full backup
before you run a non-trusted program, which is indeed a good thing.  Now if
only we could timely detect that infection did occur, then it would be easy
to restore and stop the spreading.

- --Lambert Meertens, CWI, Amsterdam; lambert@cwi.nl