Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!zaphod.mps.ohio-state.edu!wuarchive!decwrl!infopiz!athertn!hemlock!mcgregor
From: mcgregor@hemlock.Atherton.COM (Scott McGregor)
Newsgroups: comp.software-eng
Subject: Re: recap so far
Message-ID: <27245@athertn.Atherton.COM>
Date: 17 Jul 90 23:22:03 GMT
References: <27199@athertn.Atherton.COM> <1990Jul10.134226.22459@iti.org> <268462df> <39400111@m.cs.uiuc.edu> <31558@cup.portal.com>
Sender: news@athertn.Atherton.COM
Reply-To: mcgregor@hemlock.Atherton.COM (Scott McGregor)
Organization: Atherton Technology -- Sunnyvale, CA
Lines: 118

Some further comments concerning Cliff's recap. In general I agree with the facts that Cliff records. I have some differing interpretations of what should be deduced from them.

Cliff writes that software engineering prediction models

>... attempt to create an accurate measure of something that is not
> measurable in the first place.

Software development time is surely measurable. We measure it after the fact every time we ship a product. At best what Cliff is saying is that they attempt to give an accurate PREDICTION of something that is not PREDICTABLE in the first place. This is more correct, but truthfully, those who build tools for predictions of this sort are usually quite upfront about the fact that they create predictions of MEANS with some VARIANCE. Often the tools will show you the variance; if they do not, you can consult the original research data on which the tool was built to get the underlying measures of variance.

As Cliff points out, these levels of variance are extremely high. In that sense you won't get a very accurate prediction. However, to say that the tools attempt to give an accurate prediction is overstating things. The tools merely attempt to give the MOST accurate prediction possible, admitting that this prediction is not very dependable. It may not be accurate, but it is a better prediction than one might arrive at with no data.
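
The idea of a predicted MEAN with some VARIANCE can be made concrete with a small sketch. It is only an illustration under stated assumptions: the project history below is invented, and the function name prediction_range and the choice to work with the logarithm of actual/estimated effort are assumptions made here, not the method of any particular estimation tool.

```python
import math
import statistics

# Hypothetical history of (estimated, actual) effort in person-months.
# These numbers are invented purely for illustration.
HISTORY = [(10, 14), (20, 19), (6, 12), (40, 70), (15, 15), (8, 11)]


def prediction_range(new_estimate, history=HISTORY, spread=1.0):
    """Return (low, expected, high) for a new point estimate.

    Works on log(actual / estimated) so that overruns and underruns
    of the same proportion are treated symmetrically. `spread` is the
    number of standard deviations used for the low and high bounds.
    """
    log_ratios = [math.log(actual / est) for est, actual in history]
    mean = statistics.mean(log_ratios)
    sd = statistics.stdev(log_ratios)  # sample standard deviation
    expected = new_estimate * math.exp(mean)
    low = new_estimate * math.exp(mean - spread * sd)
    high = new_estimate * math.exp(mean + spread * sd)
    return low, expected, high


if __name__ == "__main__":
    low, expected, high = prediction_range(12)
    print(f"expected {expected:.1f}, plausible range {low:.1f} to {high:.1f}")
```

A wide gap between the low and high figures is the variance in question: the central figure is still the best available guess, and the width of the range says how little to depend on it.
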
Clearly, software schedules are predicted by people, and so ipso facto are predictable; it is merely that the predictions may not be very accurate.

> Such estimates are bound to be useless because numerous coding difficulty
> assumptions will be wrong without support of actual time measurements.
> These errors propogate in an exponential manner throughout layers of code.

Cliff correctly identifies a chief source of the lack of accuracy. Software, like other chaotic dynamic processes, seems so sensitively dependent on specific initial conditions that it is inherently impossible to predict future states with certainty. However, his characterization of such estimates as useless is an overstatement.

In the game of blackjack (21), the odds of a successful draw can be calculated if your hand contains 13 points, 15 points, 19 points, etc. When you know the odds, you still do not know accurately what the outcome of the next draw will be. But knowing (and playing) the odds can be better than ignoring them in the long run. The odds aren't useless merely because they don't offer certainty for each draw. Card counters can do even better because they have more accurate odds estimates, but they don't have better certainty over the next particular draw. So there is value in more accurate estimates, even when you still have lots of room for error.

> It is NEVER right to design a software system ONLY on paper for the purpose
> of devising an estimate, unless you can be happy with an estimate that
> may be way off.

The point is that some people can be MORE happy with an estimate that is LESS way off, even if it still is far from perfect. The alternative, NO estimate, is often psychologically unacceptable to risk-averse persons. They might not like the amount of variance in the current estimate; they might want more certainty, but some estimate is better than none.

> Lets discuss hardware estimates.
> Do such estimates allot a fixed amount of time for each chip to
> to estimate a board design (eg 80386 = 2 weeks, etc.)

I have created and maintained support systems for HW designers. The answer is that, in a manner of speaking, they DO make estimates based on numbers of elements. IC designers frequently composed their designs by putting together numbers of pre-defined elements: so many gate-arrays of a certain sort, some I/O buffers, so many registers, an arithmetic unit that does such and such. Board-level designers would then say so many RAMs, a such-and-such CPU, this sort of I/O processor, a memory manager, a floating point chip, etc. System designers would design with a thus-and-so bus, a motherboard, various I/O and memory boards, etc. First estimates for how long these projects would take were often derived merely from the total numbers of elements to be used at any given level.

Hardware design estimates, especially early ones, were often inaccurate. But, just as with software estimates, they improved as more of the project was completed. However, I believe that the hardware designers had an advantage, and that this advantage yielded some important reduction in variances. In general, the amount of complexity on a chip is roughly the same as the amount of complexity on another chip of the same type in the same era. Similarly with boards and systems. Also interestingly, a board is typically not much more complex than the underlying chip if you treat each component on the board as a black box, the way you might treat a gate on a chip as a black box. And it is not easy to visually confuse a chip, a board and a system. Software components, on the other hand, tend to vary greatly in internal complexity, in ways that are not at all apparent from their external interfaces.
Thus if you are making estimates about complexity from the external interfaces (or requirements specs), you don't have the same strong relationship with the internals that you do with a chip, board, or system. So you are unable to reduce variance as much. For this reason, many estimation tools such as COCOMO allow you to add additional data about the "expected complexity" of the module to be designed. But these are less precise relationships than the physical constraint relationships of chip and board dimensions. I believe that it is the power of constraint relationships to predict complexity that accounts for why, despite everything, lines of code usually has the strongest relationship to development time of any typically measured predictor variable.

Scott McGregor
mcgregor@atherton.com