Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!olivea!uunet!bellcore!grumpy!bytor
From: bytor@grumpy.Berkeley.EDU (Ross Huitt)
Newsgroups: comp.software-eng
Subject: Re: OOP in the "real world"
Message-ID: <1991Jun27.194748.732@bellcore.bellcore.com>
Date: 27 Jun 91 19:47:48 GMT
References: <1991Jun27.165340.22545@den.mmc.com>
Sender: usenet@bellcore.bellcore.com (Poster of News)
Organization: Bellcore
Lines: 62

To: randy@tigercat.den.mmc.com (Randy Stafford)
Cc: bytor duncan paul

I decided to take this off line until we clear this up. If you think things are clear, then post this back to the net.

I find calculating metrics on Smalltalk code a little troublesome. In C and other procedural languages, and to a lesser extent C++, counting statements does make some sense. But counting statements in Smalltalk is dubious at best. You tend to see very large cascades of expressions that contain a lot of functionality. So, when it came time to count Smalltalk methods I used the following rules:

  1) Don't count blank lines.
  2) Don't count lines with just comments.
  3) Count remaining newlines in the source of the method as LOCs.

I hate counting newlines for metrics for any reason, but right now I'll live with these rules for Smalltalk. Please note, however, that this is not your definition of SLOC.

Looking at the Smalltalk/V image and a couple of medium (100+ class) applications indicated averages around 3 lines of code per method. I didn't count bytes, but I would venture a guess that the lines were around 30-40 bytes, as you suggested.

Metrics for C++ are quite a bit easier. I count executable statements, in particular all statements as defined in the ARM grammar except labeled-statements and compound-statements. Metrics for the NIH libraries, several publicly available systems, as well as a couple of production systems had averages in the range of three to five statements per method.
Also, the more 'object-oriented' the system is, the lower the average will be. If you triple these stmt-per-method numbers for C++, it provides a fair approximation of the number of raw lines of source code.

The tripling I suggested was for the 42K LOC number. It may (or may not) provide a rough indicator for the number of actual lines (newlines/SLOC) in the source code of the method. Doubling may be more accurate, as suggested by your 85K number. So, maybe we just misunderstood each other's definition of LOC.

I like the idea of trying to estimate the LOC per method based on the image size and method count, but I don't do metrics full-time, so checking this out will have to wait. I don't know if it will work, but I don't think anybody else does either.

My main point is that the number of executable statements per method in an object-oriented C++ system will be very low, especially if the Law of Demeter is adhered to. My assertion is that 'very low' will be less than 4 statements for most systems. The number of statements per function in a C system is typically greater than 10 for the systems I have looked at. I think this difference in statements per function/method is significant and will have a very great impact on maintenance.

So, it appears that a better estimate for the SLOC (where SLOC is the number of actual physical lines of raw source code) of the Analyst is around 85-100 KSLOC. This assumes, of course, that you, Dr. Love and I are defining SLOC in the same manner. (I still assert that there is no definition of SLOC that would yield 350 KSLOC for that system, which is the reason I posted in the first place.)

I hope this clears things up for now.

Ross Huitt
bytor@ctt.bellcore.com
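
For concreteness, the three Smalltalk counting rules described earlier could be sketched roughly as follows. This is only an illustration in Python, not the tool actually used for the numbers quoted here. The function name `count_smalltalk_loc` and the sample method are hypothetical, and the sketch assumes that a "line with just comments" is a line holding one complete double-quoted Smalltalk comment; comments that span several lines are not handled.

```python
def count_smalltalk_loc(method_source: str) -> int:
    """Count lines of code in one Smalltalk method.

    Assumption: a comment-only line is a single double-quoted comment
    that opens and closes on the same line.
    """
    loc = 0
    for line in method_source.splitlines():
        text = line.strip()
        if not text:
            continue  # rule 1: don't count blank lines
        is_comment_only = (
            len(text) >= 2
            and text.startswith('"')
            and text.endswith('"')
            and text.count('"') == 2
        )
        if is_comment_only:
            continue  # rule 2: don't count lines with just comments
        loc += 1  # rule 3: every remaining line counts as one LOC
    return loc


# Hypothetical sample method, used only to exercise the rules.
example = '''printOn: aStream
    "Print a short description of the receiver."

    aStream
        nextPutAll: 'Point ';
        print: x
'''

print(count_smalltalk_loc(example))
```

Note that under these rules the cascade in the sample counts as three lines even though it is a single statement, which is the mismatch between line counts and statement counts discussed above.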