Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!think.com!samsung!crackers!m2c!risky.ecs.umass.edu!dime!chelm.cs.umass.edu!yodaiken From: yodaiken@chelm.cs.umass.edu (victor yodaiken) Newsgroups: comp.lang.misc Subject: Re: Will this *thread* ever halt? Message-ID: <30831@dime.cs.umass.edu> Date: 21 May 91 21:31:28 GMT References: <30814@dime.cs.umass.edu> <3409@optima.cs.arizona.edu> Sender: news@dime.cs.umass.edu Organization: University of Massachusetts, Amherst Lines: 67 In article <3409@optima.cs.arizona.edu> debray@cs.arizona.edu (Saumya K. Debray) writes: >yodaikenchelm.cs.umass.edu (victor yodaiken) writes: > >> I argue that it is possible, in theory, for a C compiler to detect all >> paths that lead to non-terminating loops and to flag these paths. >[ ... ] >> If we give X.c to the hypothical smart compiler and it tells us that >> X.c contains a path that exceeds the process memory limits on this >> machine, or that X.c contains a path that will cause it to enter a >> non-terminating loop if "alloc" fails, then the compiler has solved >> the halting problem for this code. > >Good. Perhaps Mr. Yodaiken would care to convince us infidels once and >for all by posting some source code to do what he has so eloquently >argued can be done? I'm flattered that you have so much confidence in my abilites: if I can't come up with an algorithm, then clearly, such an algorithm cannot exist. There is an obvious, and obviously impractical algorithm, however. Let's first consider the case of programs that just compute a function of their arguments: i.e. programs that perform no i/o but can be written as subroutines of the form prog( type: x, type: y .... ). For example, suppose we have a C function "int f(x,y,z) int x,y,z ..." Now we look at how "ints" are implemented on the target machine, suppose that each int is a 32 bit value. We then emulate f(x,y,z) for each of the mere 2^96 possible assignments. The emulation consists of single stepping the computation of "f" either on the target computer or within a program emulating this computer. At each step, we make a copy of the entire memory space of the emulation. The program is sure to either repeat state (a non-terminating loop) or exit within a bounded amount of time because it can only reference a bounded amount of memory. [BTW, this is not as impractical as it seems. H. Massalin has written an ingenious assembly language optimizer that, given a short assembly language routine that computes a simple result, will find the shortest possible sequence of assembly statements that computes the same function. Masslin's "Super-optimizer" essentially makes an exhaustive search of the set of all 1 line assembly programs, then all 2 line programs, ..... With some smart heuristics the optimizer comes up with some incredible sequences of code (e.g., a min function that has no branch). There is a paper on the optimizer in some recent ASPLOS proceeedings. So, enough of this whining about what we can't do. ] Now we return to the impractical algorithm and extend it to programs that have system calls. Ther are a couple of strategies here, but we'll take the dumbest, since it requires the least effort from me. Each system call, i/o driven or not, changes program state in only a bounded number of ways. For example, a call "read(filedescriptor,bufferaddr,n)" will change the values of the "n" bytes of memory begining a "bufferaddr" in some way. Whenever we encounter such a call in the "C" program, we simply try all 2^(n*8) possibilities. Similarly, with calls to "alloc" which either add some non-determined data to the program data space (up to a known limit) or return a fail code. We just test all branches. Pretty simple, hey? This algorithm will work. And in some cases it might even be practical. In hardware design it is already common to build special purpose machines that do brute force emulations of chip behavior. Is there a smarter algorithm, a better programing language, a well defined o.s. interface that will make this type of analysis practical without special purpose hardware? No idea. But, there is emphatically no mathematical theorem that eliminates the possibility of such an algorithm. >The proof of the pudding, Mr. Yodaiken, is in the eating. You've argued >that it's possible to decide the halting problem for "real computers". >So show us.