Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!think.com!samsung!crackers!m2c!risky.ecs.umass.edu!dime!chelm.cs.umass.edu!yodaiken
From: yodaiken@chelm.cs.umass.edu (victor yodaiken)
Newsgroups: comp.lang.misc
Subject: Re: Will this *thread* ever halt?
Message-ID: <30831@dime.cs.umass.edu>
Date: 21 May 91 21:31:28 GMT
References: <gZy9221w164w@mantis.co.uk> <30814@dime.cs.umass.edu> <3409@optima.cs.arizona.edu>
Sender: news@dime.cs.umass.edu
Organization: University of Massachusetts, Amherst
Lines: 67

In article <3409@optima.cs.arizona.edu> debray@cs.arizona.edu (Saumya K. Debray) writes:
>yodaikenchelm.cs.umass.edu (victor yodaiken) writes:
>
>> I argue that it is possible, in theory, for a C compiler to detect all
>> paths that lead to non-terminating loops and to flag these paths. 
>[ ... ]
>> If we give X.c to the hypothical smart compiler and it tells us that
>> X.c contains a path that exceeds the process memory limits on this
>> machine, or that X.c contains a path that will cause it to enter a
>> non-terminating loop if "alloc" fails, then the compiler has solved
>> the halting problem for this code.
>
>Good.  Perhaps Mr. Yodaiken would care to convince us infidels once and
>for all by posting some source code to do what he has so eloquently
>argued can be done?


I'm flattered that you have so much confidence in my abilites: if I can't
come up with an algorithm, then clearly, such an algorithm cannot exist.
There is an obvious, and obviously impractical algorithm, however. Let's
first consider the case of programs that just compute a function of 
their arguments: i.e. programs that perform no i/o but  can be written
as subroutines of the form prog( type: x, type: y .... ). 
For example, suppose we have a C function  "int f(x,y,z) int x,y,z ..."
Now we look at how "ints" are implemented on the target machine, suppose
that each int is a 32 bit value. We then emulate f(x,y,z) for each of
the mere 2^96 possible assignments. The emulation consists of single
stepping the computation of "f" either on the target computer or within
a program emulating this computer. At each step, we make a copy
of the entire memory space of the emulation. The program is sure to
either repeat state (a non-terminating loop) or exit within a bounded
amount of time because it can only reference a bounded amount of memory.

[BTW, this is not as impractical as it seems. H. Massalin has written
an ingenious assembly language optimizer that, given a short
assembly language routine that computes a simple result, will find the
shortest possible sequence of assembly statements that computes the
same function. Masslin's "Super-optimizer" essentially makes an
exhaustive search of the set of all 1 line assembly programs, then all
2 line programs, ..... With some smart heuristics the optimizer comes up with
some incredible sequences of code (e.g., a min function that has no
branch). There is a paper on the optimizer in some recent ASPLOS
proceeedings. So, enough of this whining about what we can't do. ]

Now we return to the impractical algorithm and extend it to programs that
have system calls. Ther are a couple of strategies here, but we'll take
the dumbest, since it requires the least effort  from me. Each system
call, i/o driven or not, changes program state in only a bounded number
of ways. For example, a call "read(filedescriptor,bufferaddr,n)"
will change the values of the "n" bytes of memory begining a 
"bufferaddr" in some way. Whenever we encounter such a call in the
"C" program, we simply try all 2^(n*8) possibilities. Similarly, with
calls to "alloc" which either add some non-determined data to the program
data space (up to a known limit) or return a fail code. We just test
all branches. Pretty simple, hey? 

This algorithm will work. And in some cases it might even be practical.
In hardware design it is already common to build special purpose machines
that do brute force emulations of chip behavior.  Is there a smarter
algorithm, a better programing language, a well defined o.s. interface
that will make this type of analysis practical without special purpose
hardware? No idea. But, there is emphatically no mathematical theorem
that eliminates the possibility of such an algorithm.

>The proof of the pudding, Mr. Yodaiken, is in the eating.  You've argued
>that it's possible to decide the halting problem for "real computers".  
>So show us.