Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!swrinde!elroy.jpl.nasa.gov!decwrl!pa.dec.com!hollie.rdg.dec.com!jch
From: jch@hollie.rdg.dec.com (John Haxby)
Newsgroups: comp.software-eng
Subject: Re: Counting semicolons (was: Re: WANTED: "C" code line counter program)
Message-ID: <1991Mar28.091725.17574@hollie.rdg.dec.com>
Date: 28 Mar 91 09:17:25 GMT
References: <RICHARD.91Mar12184440@dompap.iesd.auc.dk> <1991Mar15.132757.6883@comm.wang.com> <4196@zaphod.UUCP> <7547@idunno.Princeton.EDU>
Sender: news@hollie.rdg.dec.com (Mr News)
Reply-To: jch@hollie.rdg.dec.com (John Haxby)
Organization: Digital Equipment Corporation
Lines: 74


In article <7547@idunno.Princeton.EDU>, egnilges@phoenix.Princeton.EDU (Ed Nilges) writes:
|> Good point, Bob.  A strict metric using "statement" as defined in the
|> Backus-Naur Form definition of C would measure the following code
|> fragment
|> 
|> 
|>      index = 0;
|>      count = 0;
|>      for ( ; index<limit; index++ ) if ( condition ) count++;
|> 
|> 
|> as "larger" than                     
|> 
|> 
|>      for ( index = 0, count = 0; index<limit; index++ )
|>          if ( condition ) count++;
|> 

Actually, both of these contain six "terms" (I'm
not sure what C calls these primitive units, which is
why I've put the word in quotes).  In both cases the
terms are

	index = 0
	count = 0
	index < limit
	index++
	condition
	count++

If you want a metric that doesn't depend (much) on
coding style, then I suggest you count terms rather
than statements. The metric has inaccuracies when
you consider statements like

	index = count = 0;

which is still one term and

	index++ < limit;

which is also one term.

There is no absolute metric for counting useful
chunks of code (what does useful mean?).  It's better
to choose a metric and know what the limitations of
that metric are than to spend weeks writing some tool
(i.e. a parser) that counts chunks and then falls over
in a heap because the code has to be run through the C
pre-processor first.

Personally, I count semi-colons.  I know it won't deal
correctly with

	if (something-failing)
		print (error),
		exit (1);

but then I'm not that bothered--I don't really regard
this as more than one chunk anyway. And I know that
the metric will differ between programmer styles, and I
don't mind that.  I don't mind these inaccuracies because
I'm only after about 10% accuracy--I want to know
if two programs are more-or-less the same size, or
one is half the size of the other; I want to know,
roughly, what proportion of the code is non-functional
(whether comments or whitespace).  Mind you, that tends to
be clouded by whether people include RCS change histories
in their files: every change adds a minimum of three lines of comment!
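
A semicolon counter of this kind might be sketched in C
roughly as follows.  This is only an illustration under
stated assumptions, not the tool described above: the
name count_semicolons is made up here, the source is
assumed to be already in memory as a NUL-terminated
string, and nothing is done about the pre-processor,
trigraphs or backslash-newline splicing.  All it does is
skip comments and string and character literals so that
semicolons inside them are not counted.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count semicolons in C source text held in a NUL-terminated string.
 * Hypothetical helper, a sketch only: semicolons inside comments,
 * string literals and character literals are ignored; no
 * pre-processing is attempted.
 */
size_t count_semicolons(const char *src)
{
    enum { CODE, COMMENT, STRING, CHARLIT } state = CODE;
    size_t count = 0;
    const char *p;

    assert(src != NULL);

    for (p = src; *p != '\0'; p++) {
        switch (state) {
        case CODE:
            if (p[0] == '/' && p[1] == '*') {
                state = COMMENT;
                p++;                    /* step over the '*' as well */
            } else if (*p == '"') {
                state = STRING;
            } else if (*p == '\'') {
                state = CHARLIT;
            } else if (*p == ';') {
                count++;
            }
            break;
        case COMMENT:
            if (p[0] == '*' && p[1] == '/') {
                state = CODE;
                p++;                    /* step over the '/' as well */
            }
            break;
        case STRING:
        case CHARLIT:
            if (*p == '\\' && p[1] != '\0') {
                p++;                    /* skip the escaped character */
            } else if (*p == (state == STRING ? '"' : '\'')) {
                state = CODE;           /* closing quote found */
            }
            break;
        }
    }
    return count;
}
```

Fed the two quoted fragments, this gives 5 for the first
and 3 for the second, since the two semicolons inside
each for (...) header are counted as well.  That is the
sort of style dependence to expect from the metric.  The
comma-separated print/exit example counts as 1.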
-- 
John Haxby, Definitively Wrong.
Digital				<jch@wessex.rdg.dec.com>
Reading, England		<...!ukc!wessex!jch>