Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!att!cbnewsl!sar0
From: sar0@cbnewsl.att.com (stephen.a.rago)
Newsgroups: comp.unix.wizards
Subject: Re: Streams message allocation
Keywords: streams deadlock
Message-ID: <1990Aug2.043059.578@cbnewsl.att.com>
Date: 2 Aug 90 04:30:59 GMT
References: <10332@celit.fps.com>
Distribution: usa
Organization: AT&T Bell Laboratories
Lines: 61

In article <10332@celit.fps.com>, hutch@fps.com (Jim Hutchison) writes:
> I have been reading through the systemVr2 streams porting doc for the 3b2,
> in an effort to better understand the streams code in Suns 4.1 source.

I hope you mean SVR3, because STREAMS wasn't in SVR2.

> I noted that there is some logic to avoid deadlocks which result from all
> the message buffer space being in use.  There is a pool, and a certain
> amount which is reserved for high-priority messages.

It's not for deadlocks.  It's because recovery from message allocation
failure is clumsy at best and necessitates either a delay or loss of data,
so the caller indicates how badly the message is needed.

> Can someone with real-life experience tell me what the relationship is
> between activity and the proportion of the size of the pool to the size of
> the reserved high-priority part?  I'd imagine that by counting the high
> priority sections (areas that can't sleep) which allocate buffers per
> second, I'd get the number of high-priority buffers that would have to be
> available in that second.  This number would rely on certain factors such
> as the message rate and system load as seen by the streams.

The percentages are tunable via the kernel master file, so administrators
can modify the cutoff where failures start occurring.  I've never heard
of a system where the defaults were not used.  BPRI_LO is used by the
stream head during writes because the stream head can easily sleep.
BPRI_MED is used by modules and drivers for most types of messages.
BPRI_HI is used by modules and drivers for high-priority messages.
It is not expected that many high-priority messages would be needed
under normal circumstances, but again, these are cases where the caller
probably doesn't want the delay and doesn't want to forget the event.
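
To make the priority cutoffs concrete, here is a minimal toy model of
one buffer class. It is a sketch, not the kernel's allocb() code: the
BufferClass name, its methods, and the 80/90/100 percent cutoffs are
assumptions chosen for illustration, not the SVR3 defaults.

```python
# Toy model of priority-based allocation cutoffs for one buffer class.
# The cutoff percentages are illustrative assumptions, not SVR3 defaults.
BPRI_LO, BPRI_MED, BPRI_HI = 1, 2, 3

CUTOFF_PCT = {BPRI_LO: 80, BPRI_MED: 90, BPRI_HI: 100}  # hypothetical values


class BufferClass:
    def __init__(self, total):
        self.total = total    # buffers configured for this class
        self.in_use = 0       # buffers currently allocated
        self.failures = 0     # allocation requests that were refused

    def alloc(self, pri):
        """Grant a buffer only if usage is below the cutoff for this priority."""
        limit = self.total * CUTOFF_PCT[pri] // 100
        if self.in_use >= limit:
            self.failures += 1
            return False
        self.in_use += 1
        return True

    def free(self):
        """Return one buffer to the pool."""
        if self.in_use > 0:
            self.in_use -= 1
```

With 10 buffers configured, low-priority requests start failing once 8
are in use, medium-priority requests once 9 are in use, and only
high-priority requests can take the last one.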

The number of messages needed per buffer class (size) is something that
can only be determined statistically or empirically.  In the absence of
information like inter-arrival rate of allocation requests and message
hold times, the best way to proceed is to start with an educated guess
of how many messages you may need.  For example, if you have one ethernet
card, you know that packets are probably going to be using the streams
buffers in the 2K range and the 64 or 128 byte range.  You also know
that the ethernet driver can afford to drop a few packets on the floor,
but you'd like to avoid this situation.  You can configure lots of
streams messages, but you'll be ultimately limited by the number of
buffers on the I/O card and the transmission rate of the media.  You
also need to consider that the more memory you devote to streams messages,
the less memory will be available for other things in the system, and
paging activity will increase.  You might start with 32 2K buffers and,
say, 64 64-byte buffers and 64 128-byte buffers.
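
As a quick check on the memory cost of that starting point, the sketch
below totals the data space it reserves. The initial_config and
data_bytes names are made up for the example, and the figure ignores
whatever per-message header overhead the kernel adds.

```python
# Initial guess from the text: (buffer size in bytes, count) per class.
initial_config = [(64, 64), (128, 64), (2048, 32)]


def data_bytes(config):
    """Total bytes of buffer data space, ignoring per-message overhead."""
    return sum(size * count for size, count in config)


total = data_bytes(initial_config)
print(f"{total} bytes ({total / 1024:.0f} KB)")  # prints "77824 bytes (76 KB)"
```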

The next step is to run your workloads and then use the strstat
function of crash(1M) to see if there were any allocation failures
for any buffer sizes.  Any class that shows failures needs more
buffers.  If there are any classes where the maximum number of buffers
in use at one time was far less than the number configured, then you
can decrease the number allocated for that class.
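
The adjustment rule can be sketched as a small function. It assumes you
have already read three numbers per class off the statistics by hand
(configured count, maximum in use, failures); it does not parse crash
output. The function name and the grow, slack, and headroom factors are
arbitrary assumptions, not recommended values.

```python
import math


def suggest(configured, max_in_use, failures,
            grow=1.25, slack=0.5, headroom=1.25):
    """Suggest a new buffer count for one class after a workload run."""
    if failures > 0:
        # Allocation failures were seen: configure more buffers.
        return max(configured + 1, math.ceil(configured * grow))
    if max_in_use < configured * slack:
        # Peak usage was far below the configured count: shrink,
        # keeping some headroom above the observed peak.
        return max(1, math.ceil(max_in_use * headroom))
    return configured  # no failures and reasonably well used: leave alone
```

For example, suggest(32, 32, 5) grows a class of 32 to 40, while
suggest(64, 10, 0) shrinks a class of 64 to 13.  Rerun the workload
after each change, since one run is only one sample.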

Some people at AT&T were using an Erlang traffic model to configure
their streams buffers, but as of SVR4 it's a moot point, because
buffers are allocated dynamically there.
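
For a statically configured system, one way such a model could be
applied is the Erlang B blocking formula, sketched below. Which Erlang
formula was actually used is not stated here, so treat this as an
assumption. The offered load is the allocation request rate multiplied
by the mean hold time, and the function names are made up for the
example.

```python
def erlang_b(n, offered_load):
    """Blocking probability for n buffers at the given offered load (erlangs)."""
    b = 1.0  # B(0) = 1: with no buffers every request is blocked
    for k in range(1, n + 1):
        # Standard recurrence: B(k) = A*B(k-1) / (k + A*B(k-1))
        b = offered_load * b / (k + offered_load * b)
    return b


def buffers_needed(offered_load, target_blocking):
    """Smallest buffer count whose blocking probability meets the target."""
    if offered_load <= 0:
        return 0  # no traffic, so no buffers are needed
    n = 0
    while erlang_b(n, offered_load) > target_blocking:
        n += 1
    return n
```

With an offered load of 1 erlang, one buffer blocks half of all
requests and two buffers block one in five, so buffers_needed(1.0, 0.25)
returns 2.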

Steve Rago
sar@attunix.att.com