Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.csd.uwm.edu!mrsvr.UUCP!shoreland.uucp!hallett
From: hallett@shoreland.uucp (Jeff Hallett x4-6328)
Newsgroups: comp.software-eng
Subject: Re: C source lines in file
Message-ID: <895@mrsvr.UUCP>
Date: 18 Aug 89 17:47:23 GMT
References: <35120@ccicpg.UUCP> <16018@vail.ICO.ISC.COM>
Sender: news@mrsvr.UUCP
Reply-To: hallett@shoreland.UUCP (Jeff Hallett x4-6328)
Organization: GE Medical Systems, Milwaukee, WI
Lines: 69

In article <16018@vail.ICO.ISC.COM> rcd@ico.ISC.COM (Dick Dunn) writes:
>swonk@ccicpg.UUCP (Glen Swonk) writes:
>> Does anyone have a program or a method of determining
>> the number of C source lines in a source file?
>> My assumption is that comments don't count as source
>> lines unless the comment is on a line with code.

In my former job, we came up with a way to measure C lines that
suited us.  The basic approach was to:

1. Remove all comments.
2. Ensure that there is only one "statement" of code per textual
   line (a statement here may be a curly brace or a null statement,
   i.e., a solitary ;).
3. Remove all blank lines, and all braces and semicolons that have
   no other text on the line with them.
4. Remove all 'do' keywords (they do no work).
5. Pull all broken function calls together onto one line (i.e.,
   where a newline was inserted between parameters to make the call
   prettier).
6. Count the lines which are left.

Granted, this implies some "sanity" on the part of the programmer not
to do some really weird things (like putting the ; for a statement on
the line below the statement), but on the whole this procedure (done
mostly with sed scripts) produced the same counts we would have
arrived at by hand.

>it's clear you're off on the wrong foot.  A count of source lines is NOT a
>useful measure of program size or complexity.  Incidentally, be careful
>about the difference between size and complexity!
>

Excellent point about size vs. complexity.  However, "size" is a
nebulous term (more below).

>
>I offer two rules about measuring program size/complexity:
>
>1.  Any variant of "source line count" is useless as a measure of the
>program.
>    I've heard countless times the rationalization that "Well, it may
>    not be good, but it's the best we can do."  This is WRONG!  It's
>    worse than no measure at all.  It implies that you have information

I agree that LOC really is a bad measure of productivity, but so are
most of the items listed by Dick in his earlier posting.  The
productivity of a coder is a difficult thing to measure, and most
methods I've heard of are really inadequate, since I think that
writing code is still more an art than a science or a manufacturing
system.

However, LOC is still a good estimator of cost.  I say this with the
caveat that different s/w houses will have different correlations and
that it is still strongly linked to complexity.  This is why I like
methods like COCOMO, which attempt to relate lines produced to
various drivers, covering both the nature of the code and the
programmers involved, to produce estimates of cost and time.  Also,
most of these methods can be modified to reflect a particular
production site.

How one defines "size" is, I think, less important than how
consistently and accurately it can be measured and what it is used
for.  To judge the quality of ANY system on its size alone is
foolhardy, and measurement systems that encourage programmers to
bloat their code are especially destructive (as Dick points out).

I encourage Glen to check out not only various software economics
books, but also managerial evaluation and operations research texts,
to determine useful ways to utilize what is collected.

-- 
Jeffrey A. Hallett, PET Software Engineering
GE Medical Systems, W641, PO Box 414
Milwaukee, WI  53201
(414) 548-5173 : EMAIL - hallett@postron.gemed.ge.com
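
For anyone who wants to try the counting procedure described in this
post (strip comments, one statement per line, drop lone braces and
semicolons, drop 'do', rejoin broken calls, count what is left), here
is a minimal sketch.  It is an illustration only, written in Python
rather than the sed scripts mentioned above, and it is not the
original tooling.  The names strip_comments, split_statements,
count_c_lines and SAMPLE are hypothetical.  It assumes /* */ comments
only, ignores backslash line continuations, and relies on the same
programmer "sanity" the post asks for.

```python
import re

# Matches a /* */ comment, a string literal, or a character literal.
# Literals are matched so that comment markers or ';' inside them are
# not mistaken for real comments or statement ends.
_TOKEN = re.compile(
    r'/\*.*?\*/|"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\'', re.DOTALL
)


def strip_comments(src):
    """Replace comments with a space and literals with a placeholder."""
    def repl(match):
        return " " if match.group(0).startswith("/*") else '""'
    return _TOKEN.sub(repl, src)


def split_statements(code):
    """Split into one "statement" per entry.

    Newlines inside parentheses are joined (broken function calls);
    braces and top-level semicolons end a statement and are dropped.
    """
    stmts, cur, depth = [], [], 0
    for ch in code:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        if ch == "\n" and depth > 0:
            cur.append(" ")
        elif ch in "\n{}" or (ch == ";" and depth == 0):
            stmts.append("".join(cur))
            cur = []
        else:
            cur.append(ch)
    stmts.append("".join(cur))
    return stmts


def count_c_lines(src):
    """Count the statements that still have text once 'do' is removed."""
    count = 0
    for stmt in split_statements(strip_comments(src)):
        if re.sub(r"\bdo\b", "", stmt).strip():
            count += 1
    return count


SAMPLE = r'''#include <stdio.h>

/* print a greeting */
int main(void)
{
    int i = 0;          /* counter */
    do {
        printf("hi; there\n",
               i);
        i++;
    } while (i < 3);
    return 0;
}
'''

if __name__ == "__main__":
    print(count_c_lines(SAMPLE))  # 7
```

The seven lines counted in SAMPLE are the #include, the function
header, the declaration, the rejoined printf call, i++, the while
clause and the return.  Whether preprocessor lines or a bare function
header ought to count is a local policy choice; what matters for cost
estimation is that the same rule is applied consistently.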