Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!uunet!dev!dgis!jkrueger
From: jkrueger@dgis.dtic.dla.mil (Jon)
Newsgroups: comp.databases
Subject: Re: Performance Data (was Re: Client/Server processes and implementations)
Message-ID: <684@dgis.dtic.dla.mil>
Date: 1 Dec 89 20:17:02 GMT
References: <7169@sybase.sybase.com> <13520006@hpisod2.HP.COM>
Organization: Defense Technical Information Center (DTIC), Alexandria VA
Lines: 112

dhepner@hpisod2.HP.COM (Dan Hepner) writes:

>From: jkrueger@dgis.dtic.dla.mil (Jon)
>>
>> >1. Is it your experience that more than 10% of the work is done by
>> >   the clients?
>>
>> Sometimes.  If it's only 10%, we may then assign 10 clients per server,
>> thus balancing the load.  Yes, the server load increases too, but not
>> proportionately; balance might be 12 or 15 clients per server.

>In the example, if one moved 10 clients taking 10% of a 100% used CPU,
>we would simplistically end up with the client CPU 10% used, and
>the server CPU still 90%.

Perhaps I'm not making myself clear.  That's 10% per client.  10% of the
work is done by the client, and this client serves a single user.  Each
additional concurrent user gets another client, which consumes another
10%, in this example.
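To make the arithmetic concrete, here's a back-of-envelope sketch.  The
numbers are invented for illustration only: 10% of a client machine per
concurrent user, as above, plus an assumed 7% of the server machine per
user.  Neither figure is a measurement of any real system.

    /* Toy model of the load split under discussion.  ASSUMED costs:
     * each concurrent user eats a fixed fraction of the client machine
     * and a fixed fraction of the server machine.
     */
    #include <stdio.h>

    int main(void)
    {
        double per_user_client = 0.10;  /* assumed client-side cost per user */
        double per_user_server = 0.07;  /* assumed server-side cost per user */
        int users;

        for (users = 1; users <= 14; users++)
            printf("%2d users: client machine %3.0f%%, server machine %3.0f%%\n",
                   users, 100.0 * users * per_user_client,
                   100.0 * users * per_user_server);
        return 0;
    }

With those made-up numbers the client box saturates at 10 users while
the server box doesn't saturate until about 14: that's the kind of
balance I mean.  Plug in your own measured per-user costs and the same
loop tells you which box hits the wall first.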
>Adding one more client, we would end up with
>a saturated system with 11 Clients on an 11% utilized client machine,
>while the server was now 99% used.  If this were so, it wouldn't
>seem either all that balanced, and probably an economically
>unjustifiable move.

All you're saying is that a two-process model doesn't scale well if
we're already bottlenecked on either process.  This is a tautology.

>100+% increase in hardware cost yielding a 10% increase in
>throughput.

Indeed, it's worse than that: the interconnects aren't free.  One
doesn't win by distributing inherently sequential problems that one
doesn't know how to decompose.  Again, a tautology.

>> >2. Is it your experience that remote communication costs don't end
>> >   up chewing into the savings attained by moving the clients
>> >   somewhere else?
>>
>> No, the lower bandwidth is more than offset by multiprocessing.

>Let's assume you have plenty of bandwidth, but not plenty of CPU
>cycles at the server.  Remote communication, especially reliable remote
>comm, being more expensive than local communication.

In exactly the same way that reading bytes off disks costs more cycles
than referencing memory, yes.  But compelling cases can be made for not
requiring databases to reside in main memory, no?

>The extreme of my
>concern would be illustrated if the remote communication costs at the
>server end exceeded the processing/terminal handling done by the client,
>in which case one would actually lose by adding a remote machine
>for the clients.

A valid concern.  Got any data?  Measured degradation in latencies?
Throughput?  I don't deny it can happen, just asking how often it does.

And again, you're simply saying that sometimes the costs of distributing
the load exceed the benefits.  How true: sometimes the problem is
intractable, or you don't know enough to decompose it, or your tools are
poor, or the implementation is poor.  Then you get the biggest
monoprocessor you can afford, indeed.  You've admitted you can't work
smarter, so you'd better work harder.

>> >>(and in the extreme (and not at all impractical) case, you run each
>> >> client and each server on its own machine).  This model is simple,
>> >> elegant, and fundamentally right.
>>
>> This isn't the extreme case.  Multiple processors can divide work
>> with better granularity than client and server processes.

>Maybe you can clarify.  The case in question was how frequently it would
>be practical to put each client and each server on its own machine, with
>the assertion that if the client/server workload split weren't near
>50-50, it wouldn't be practical.

The usual assumption is that each client can get its own machine, but
the server has to share a single machine.  This makes the server the
bottleneck, in general.  It's also a bad assumption: multithreaded
servers can use multiprocessors to scale up, distributed DBMSs can use
distributed hosts to execute queries, and parallel servers can apply
processors to each component of each query.  The first two animals
exist now.

>The points of confusion:
> 1) "Multiple processors" can be ambiguous as to remoteness, but given
>    the context I'll assume remoteness. (right?)

Wrong, as in the previous paragraph.

> 2) Granularity.  Are you postulating a flexible division of the work
>    between client and server?  A server which is flexibly divisible
>    over both machines?

Nope, a flexible approach to designing database engines.  Remember,
your query language can't tell the difference anyway.

>I think all of these questions are facets of the same underlying question:
>how much of the typical application can be done at the client?

Fair question, but needlessly special.  The general question is: how can
we divide up the work, what tools do we need, and how many of them exist
yet?
-- Jon
-- 
Jonathan Krueger    jkrueger@dtic.dla.mil
                    uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your color
bitmapped window system on a network is emulate an ASR33?