Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!tut.cis.ohio-state.edu!ucbvax!hplabs!hpda!hpcupt1!hpisod2!dhepner
From: dhepner@hpisod2.HP.COM (Dan Hepner)
Newsgroups: comp.databases
Subject: Re: benchmarks
Message-ID: <13520005@hpisod2.HP.COM>
Date: 30 Nov 89 23:22:18 GMT
References: <2715@infmx.UUCP>
Organization: Hewlett Packard, Cupertino
Lines: 82

From: bgolden@infmx.UUCP (Bernard Golden)
>
> An organization called the Transaction Processing Performance Council has
> been addressing this problem by developing a general benchmark that will
> not be subject to individual test tweaking. Instead, the benchmark is
> very specifically defined and no modification will be allowed.

I'm with you, Bernard, in wishing this were true, but it can't be so.
Benchmarking heterogeneous systems will always be subject to individual
test tweaking. Witness dhrystone. And TPC-A will not include source
code (how could it?).

Benchmarking is a game. Like football. The goal in this game is to
figure out how to show the most Transactions Per Second (TPS) while
still meeting the rules. Clever game players will be rewarded, whether
or not this cleverness is of any value external to the benchmark.

One is reminded of the hypothetical C compilers we heard about which
checked every source file to see if it were the dhrystone benchmark,
and if so, generated code which showed extreme speed. It won't be
quite like that, but we are going to see some "features" which are
useless unless you want to run the TPC benchmark.

Some cases in point:

1) If you use a large file as part of your application, does that file
have an index? Using an indexed access method to generate TPS numbers
would be simply stupid. It isn't required. Use a hashed access method,
the fastest you can imagine for a file of just this _fixed_ size and
composition. Spend as long as you need to find the perfect scheme for
this particular file. It might take a week to add a new record? No problem.
Not too usable for any known customer? That's ok.

2) None of the transactions used to calculate TPS will ever abort.
Ah ha. Does this mean I can use an extremely optimistic logging scheme
which would take five minutes to straighten up the mess if a transaction
actually _did_ abort? You betcha. The rules say that such a feature
must be available to customers, but not how many of them might use it.

3) The transaction simulates the modification of a customer bank
account record, but no provision need be made for a customer who might
exceed his authority to withdraw the specified amount. This allows an
optimization in which the backend proceeds to complete the entire
transaction without any communication with the front end. There's no
if(reasonable) part of the txn. How many transaction applications do
you know that submit one request to the backend, get the answer, and
have completed an entire transaction which modified three files and
appended to a fourth, never having checked anything?

4) SQL required? Are you kidding? Any DBMS at all? Define DBMS. TPC
didn't even try. Almost any ad hoc access method will do.

This isn't intended to be critical of the TPC. Their charter was to
define a benchmark which was runnable on many different types of
machines and to not preclude as-yet-unknown DBMS architectures. The
benchmark is far better than "Debit-Credit" or "TP1" (which have the
above problems and much, much more). The largest file _does_ have to
be too big to be cacheable (10 MB / TPS, implying 1 GB for 100 TPS).
The Unit Under Test must receive input from and send output to
_somewhere_. Real "ACID" transactions must be used. "Full disclosure"
is required.

It's just that benchmarking for publication isn't any more than a
game. It cannot be used to reliably compare the predicted performance
of two systems on a given application load. And not just by a little,
which is where mistakes are easy to make.
"Well, we'll knock that estimate down by 50% because our txns are a little tougher". Clever, not-all-that- valuable-in-the-real-world techniques can increase a score by an order of magnitude. This isn't a claim that all published numbers will use non-general techniques, but it may be a suggestion to investigate those high TPS numbers we'll see with skepticism that the system described could run _your_ don't-seem-that-big transactions even 1/10 that fast. Dan Hepner Disclaimer: I don't work on any of HP's DBMS products at all, and certainly don't speak for HP with this opinion. Brought to you by Super Global Mega Corp .com