Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!aplcen!haven!adm!cmcl2!lanl!beta.lanl.gov!scp
From: scp@acl.lanl.gov (Stephen C. Pope)
Newsgroups: comp.object
Subject: Re: Smalltalk performance
Message-ID:
Date: 1 Nov 90 21:55:16 GMT
References: <1990Oct9.190813.23402@ux1.cso.uiuc.edu> <2444@runxtsa.runx.oz.au> <1990Oct19.180646.8649@ux1.cso.uiuc.edu> <1990Oct19.220747.5536@Neon.Stanford.EDU> <1461@media01.UUCP>
Sender: news@lanl.gov
Reply-To: scp@acl.lanl.gov
Organization: Advanced Computing Lab, LANL, NM
Lines: 80
In-reply-to: pkr@media01.UUCP's message of 25 Oct 90 09:07:29 GMT

on 25 Oct 90 09:07:29 GMT, pkr@media01.UUCP (Peter Kriens) said:

[ ... ]
Peter> Speed problems occur mostly when we try to do "massive" processing....
[ ... ]
Peter> What I am trying to say is that "running code", cq code which does an awful
Peter> lot of different things and reacts to the user is perfect in Smalltalk. But
Peter> the moment you start to handle thousands of "objects", the overhead becomes
Peter> sometimes prohibitive.
[ ... ]
Peter> Don't be fooled by claims that the overhead of Smalltalk is only 30
Peter> percent. This overhead counts only for the comparison between a message
Peter> send and a procedure call. The difference between C and Smalltalk is that
Peter> in Smalltalk, each line is one or more message sends, in C a lot of statements
Peter> are directly expanded to op-codes.

Yes. Then, consider the domain of scientific computing, where OOD/OOP
have much to offer. Because scientific computations often model
``real world physics'', the object model can be used to create very
powerful and intuitive abstractions of real-world phenomena, leading
to code which is significantly easier to understand (and play with)
than the type of design which leads to highly optimized
(vectorized/parallelized) FORTRAN code, the incumbent with which it
must in some sense compete.
The kinds of overhead implied by Smalltalk, although comparatively
trivial when treating complex composite types of coarse granularity,
are simply unacceptable when you want to do simple arithmetic
operations on collections (arrays) of ``fundamental'' types such as
floating point values.

In many scientific codes these ``complex composites'' are important as
the means to achieve coherent and transparent design (via abstraction
and encapsulation). They are as such the ``products'' of the design
phase. However, on the computational side, it is really the simple
arithmetic operations on floats and such which matter; this is the
side of the coin where OOP has a ways to go before it delivers
something of interest to the average burner of supercomputer cycles.

With a language such as C++, there is some hope that significant
portions of the computational/numeric sides of scientific codes may
get some treatment; even lacking a vectorizing C compiler, it is not
difficult to encapsulate essential behavior within a class whose
implementation is carefully crafted code [and even calls to FORTRAN
routines]. This is possible because the fundamental types are exactly
those types which the underlying hardware supports (more or less)
directly. If these so-called fundamental types are buried (in
implementation! The abstraction of the program model can be whatever
you like) underneath the baggage of message lookup and indirection,
you're going to lose.

The existence of an OOL which directly supported the notion of
data-parallelism (particularly if integrated fully with the OO model,
not just for fundamental types) would go a long way to address some of
these very real efficiency issues. Lacking that, most will continue
to work in FORTRAN, some will experiment with C++, and few will
venture elsewhere.
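To make the C++ point concrete, here is a minimal sketch of the kind of
class I have in mind. The names FieldArray, axpy and advance are
invented for illustration only and come from no real library. The
program model sees an array of field values; the implementation is a
contiguous block of plain doubles and one tight loop, with no
per-element lookup or indirection.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class, for illustration only. The abstraction is an
// "array of field values"; the storage is a contiguous block of plain
// doubles, a type the hardware supports directly.
class FieldArray {
public:
    explicit FieldArray(std::size_t n, double value = 0.0) : data_(n, value) {}

    std::size_t size() const { return data_.size(); }
    double operator[](std::size_t i) const { return data_[i]; }
    double& operator[](std::size_t i) { return data_[i]; }

    // this <- a * x + this, elementwise, over raw doubles.
    // This loop is the "carefully crafted code" hidden behind the
    // interface; it could just as well be replaced by a call to a
    // FORTRAN routine (BLAS daxpy, for example) without changing callers.
    FieldArray& axpy(double a, const FieldArray& x) {
        assert(x.size() == size());
        const double* xs = x.data_.data();
        double* ys = data_.data();
        const std::size_t n = data_.size();
        for (std::size_t i = 0; i < n; ++i) {
            ys[i] += a * xs[i];
        }
        return *this;
    }

    // Plain reduction over the same contiguous storage.
    double sum() const {
        double total = 0.0;
        for (double v : data_) {
            total += v;
        }
        return total;
    }

private:
    std::vector<double> data_;
};

// Hypothetical usage helper: advance positions by velocity * dt.
// Takes position by value, so the caller's array is left untouched.
inline FieldArray advance(FieldArray position, const FieldArray& velocity,
                          double dt) {
    position.axpy(dt, velocity);
    return position;
}
```

A caller writes advance(p, v, dt) and never sees the loop. Whether that
loop actually vectorizes is up to the compiler and the machine; the
sketch only shows that the encapsulation itself need not cost anything
per element.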
Peter> Even though we realize that the overhead is there, we have found that
Peter> the increasing hardware speed allows us to develop a LOT faster and
Peter> make much more stable code which also usually looks a lot nicer.

Unfortunately, the speedups made possible by pushing up the clock rate
and such still don't come close to the speedups possible through
vectorization and parallelization. In the supercomputing world, it is
typical to put person-years of effort into squeezing every last ounce
of performance out of one piece of code and very expensive machinery;
even with 10s of gigabytes of memory and 10s of thousands of
processing elements, our machines are still too wimpy to do some very
interesting and important work.

stephen pope
advanced computing lab
los alamos national laboratory
scp@acl.lanl.gov