Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!hplabs!hp-sdd!ucsdhub!calmasd!wlp
From: wlp@calmasd.Prime.COM (Walter L. Peterson, Jr.)
Newsgroups: comp.ai.neural-nets
Subject: Re: PDP programs not working for large nets?
Summary: PDP software porting (longish)
Keywords: Neural Nets, PDP, pa
Message-ID: <216@calmasd.Prime.COM>
Date: 23 Feb 89 16:52:26 GMT
References: <539@tekno.chalmers.se> <2730@usceast.UUCP>
Organization: Prime-Calma, San Diego R&D, Object and Data Management Group
Lines: 115

In article <2730@usceast.UUCP>, fenimore@usceast.UUCP (Fred Fenimore) writes:
>
> [stuff deleted] ...
> part of the course, we were to implement some type of project using one
> of the simulators availible.  What we found with ours was that if you
> use BP or PA, then you cannot use the block commands in the .net file.
> We tried it on a Vax 11/725 and a Apollo.  Both machines gave either
> a segmentation fault or out of memory error.  We spent some time looking
> in the various files to see if we could find the error and to confirm
> that it was a real bug in the code or what.  The semester ended with
> no results so I gave up and coded the project in C.
> ... [stuff deleted]

Since there have been several questions about porting the PDP code
lately, I'll post this rather than e-mailing it.

The obvious first question is: are you certain that you declared the
block(s) correctly?  Getting things out of order could cause the
program to attempt to allocate 0 bytes.  For example, if you take
XOR.NET and give it:

        %r 2 2 2 0
        %r 4 1 2 2

rather than:

        %r 2 2 0 2
        %r 4 1 2 2

the incorrect definition of the sending level of the first block will
cause the system to attempt to allocate 0 nodes for the sending level,
and you will get a run-time error such as you describe.

If your network definitions are correct, then there are several other
possibilities; these should be checked anyway, since the PDP code *IS*
sensitive to compiler and system differences.
First off - the block network definitions DO work.  The XOR.NET and
XOR2.NET files that are distributed with the PDP software use them for
BP, and there are other network definition files that use them also.
I have made networks with over 100 nodes in 4 layers (in, out, 2
hidden) using the block notation and have found no bugs *in the code
that reads or utilizes* this type of definition.

Note the asterisks above; the emphasis indicates that I did not find
bugs in THAT part of the code.  I *DID* find problems elsewhere.

When I began using the PDP code I found numerous, albeit minor,
problems when I compiled, linked and ran it using TURBO-C V2.0 under
MS-DOS V3.1.  The problem which you found seems to be the same as, or
close to, one of the ones which I encountered.  My first attempt to run
the BP program after having re-compiled and relinked it under TURBO-C
gave me the "no memory" error.  As I was using the XOR.* files that
come with the code and had not yet made any mods to the code, I knew
that something was not porting correctly.  After a bit I found that
the PDP code's "shells" around calloc, malloc and realloc allowed an
input parameter of 0 to slip through; if you try to calloc 0 bytes,
calloc returns NULL, and the code *was* testing for that, so it
reported the NULL as a failure to get memory.  Having fixed that I was
at least able to get started.  ( Note: this error happened soon after
the copyright notice was displayed, before any display comes up on the
screen; did yours do the same? )

*THEN* I hit the *real* problem.  I started getting Floating Point
errors.  In a program that uses floats for darn near everything, that
was real fun to track down :-).  ( I need to acknowledge some VERY
helpful hints from Walter Bright and Eric Raymond. )

The actual problem with the PDP code when ported to compilers and
systems other than the one on which it was written ( SUN UNIX ? ) is
in the casting of floats to doubles and doubles to floats.  The
culprits are at the points where there are calls to exp(x) and
pow(y, x).
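Returning to the allocation problem described above: the following is
a minimal sketch of an allocation "shell" that does not let a request
for 0 bytes reach calloc.  It is NOT the PDP source and not necessarily
the fix that was actually applied; the name safe_calloc and the policy
of rounding a zero request up to one element are assumptions made for
illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper (not from the PDP sources).  Some C libraries
 * return NULL from calloc when asked for 0 bytes, and a caller that
 * tests for NULL then misreads that as "out of memory". */
void *safe_calloc(size_t nelem, size_t size)
{
    void *p;

    /* Assumed policy: round a zero-sized request up to one element
     * of one byte so that a NULL return always means real failure. */
    if (nelem == 0)
        nelem = 1;
    if (size == 0)
        size = 1;
    assert(nelem > 0 && size > 0);

    p = calloc(nelem, size);
    if (p == NULL) {
        fprintf(stderr, "safe_calloc: out of memory (%lu x %lu bytes)\n",
                (unsigned long) nelem, (unsigned long) size);
        exit(EXIT_FAILURE);
    }
    return p;
}
```

With a wrapper like this, safe_calloc(0, sizeof(float)) hands back a
small valid buffer instead of tripping the "no memory" check.  It only
hides the symptom, though; a request for 0 nodes usually still means
that the network definition should be checked.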
I don't have the code here and I don't remember offhand which
functions contain the calls to exp() and pow(), but you can use grep
to find them.

The solution is relatively straightforward.  In those functions the
return value is computed in the return statement; change that.  Add a
local variable that is declared as double and do the computations
outside of the return statement, BEING VERY CAREFUL ABOUT USING PROPER
CASTING.  Assign the result to the local variable and then return the
local variable.  For example:

        ...
        double foo;
        ...
        foo = exp( < some expression > );
        ...
        return(foo);

This simple expedient should solve your problems.  Also, in the
functions that use pow(y, x) [ that is, y raised to the x ], y is
ALWAYS 10, so if your C library provides it, you might want to change
this to pow10(x).

These casting problems can get nasty and can cause problems that are
not easy to track down; however, once you get them fixed the code runs
just fine.  I have been able to make some rather extensive
modifications to the BP code, having gone so far as converting it to
use Scott Fahlman's "Quick-Prop" ( see "Proc. of the 1988
Connectionist Models Summer School", Morgan Kaufmann, NY, 1988 ).

If you have the time, it might also be helpful to convert the code
from the "old" K&R style to ANSI-C with function prototypes, but that
is really not necessary.  If you have a LOT of time and you are using
TURBO-C or some other system which provides good screen IO routines,
you might want to get rid of the CURSES emulation stuff.  That will
eliminate some unnecessary function calls, and for long runs of large
models that might help to speed things up.

Good luck,

-- 
Walt Peterson.  Prime - Calma San Diego R&D (Object and Data Management Group)
"The opinions expressed here are my own and do not necessarily reflect
those of Prime, Calma, or anyone else."
...{ucbvax|decvax}!sdcsvax!calmasd!wlp