Xref: utzoo comp.unix.sysv386:7832 comp.unix.wizards:25438 comp.unix.internals:2714
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!wuarchive!uunet!tdmfed!federal!hecker
From: hecker@federal.uucp (Frank Hecker)
Newsgroups: comp.unix.sysv386,comp.unix.wizards,comp.unix.internals
Subject: Re: SIGPWR signal in system v
Message-ID: <1991May6.224407.22544@federal.uucp>
Date: 6 May 91 22:44:07 GMT
References: <1991May6.112253.5344@cs.tcd.ie>
Organization: Tandem Computers, Inc.
Lines: 99

In article <1991May6.112253.5344@cs.tcd.ie> ohurley@cs.tcd.ie (Oisin
Hurley) writes:

>Does anybody out there have information on the SYS V SIGPWR signal? The man
>says that this signal occurs if there's a power failure. I have a couple of
>questions which I haven't been able to answer myself:

I have no idea how widely this signal is implemented, but it's
available on the Tandem Integrity S2 fault-tolerant system, which has
built-in batteries sufficient to keep the system up for several
minutes.  The S2 implementation provides a good example of how SIGPWR
could be used in any system with a well-integrated uninterruptible
power system.

>1. When power goes, is this signal sent to every process currently running?

On the S2, it depends on how you've configured your system.  If you've
configured it to do a shutdown, then the standard shutdown procedure
is followed and SIGPWR is not used; processes are sent SIGTERM instead.

On the other hand, if you've configured it to do a powerfail auto
restart, then (almost) all processes get sent SIGPWR.  They are then
suspended and the contents of system memory are written to disk, after
which the system turns itself off.  When power returns the contents of
memory are reloaded from disk, the processes are resumed, and they're
sent SIGPWR again.

The reason for the "almost" above is that you can set selected
processes to not resume across a power failure (e.g., to avoid
security problems with open sessions).  On power failure these
processes get sent SIGTERM instead, followed by SIGKILL.
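
The delivery rules described so far can be summarized in a short
Python sketch.  It only restates the policy given above and is not S2
code; the function name, the mode strings, and the `resumable` flag
are all made up for the example.

```python
def signals_on_power_failure(mode, resumable=True):
    """Return the signals a process receives, in order, on power failure.

    mode      -- "shutdown" or "restart" (hypothetical names for the
                 two S2 configurations described in the text)
    resumable -- False for a process set not to resume across a failure
    """
    if mode == "shutdown":
        # Standard shutdown procedure: SIGPWR is not used at all.
        return ["SIGTERM"]
    if mode == "restart":
        if resumable:
            # Once before memory is dumped, once after it is reloaded.
            return ["SIGPWR", "SIGPWR"]
        # Processes excluded from resumption are terminated instead.
        return ["SIGTERM", "SIGKILL"]
    raise ValueError("unknown mode: %r" % (mode,))
```

For example, signals_on_power_failure("restart", resumable=False)
gives the SIGTERM-then-SIGKILL sequence used for sessions that should
not survive the outage.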

Also, since processes get sent SIGPWR twice (once before going down
and once after coming back up), additional information must be
included with the signal to indicate what's happening.  This is done
using an integer code passed as the second argument to the signal
handler.
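
A handler that tells the two deliveries apart might be structured
like the sketch below.  It is written in Python purely for
illustration: Python signal handlers do not receive the S2's integer
code, so the dispatch function takes it as an ordinary argument, and
the constants PWR_GOING_DOWN and PWR_RESUMED are invented names, not
the real S2 values.

```python
# Hypothetical codes; the real S2 names and values are not given here.
PWR_GOING_DOWN = 1
PWR_RESUMED = 2

events = []  # record of what the handler did, for demonstration only


def on_sigpwr(signum, code):
    """Dispatch on the integer code that accompanies SIGPWR."""
    if code == PWR_GOING_DOWN:
        # Before suspension: flush buffers, note the time, and so on.
        events.append("checkpoint")
    elif code == PWR_RESUMED:
        # After resumption: reopen connections, re-read the clock.
        events.append("recover")
    else:
        # Unknown code: do nothing rather than guess.
        events.append("ignored")
```

The point of the design is that one handler serves both deliveries,
with the code as the only way to know which side of the outage the
process is on.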

>2. How far does the power have to drop before the signal is activated?

On the S2, the trigger is power dropping below the lower voltage
limits of the system, at which point the batteries start supplying
power to the system and the kernel is notified.

>3. How long does the power have to stay at that level to ensure activation?
>	(how about transients, etc.)

On the S2 this is configurable.  The standard figure is 15 seconds,
but 30 seconds or even a minute are also reasonable values.  If power
is restored during that period then the kernel takes no action and
applications are not affected.

>4. Is the signal generated on the mboard or is there a line from the psu?

The kernel is interrupted by the S2's I/O processor, which samples
environmental info like power and temperature.  (For example, the
system can survive the failure of one fan, but the failure of two fans
is treated like a power failure.)  I'm not familiar with the exact
hardware details.  The kernel then starts checking every five seconds
to see if power is still off, until either power is restored or the
allotted time period elapses.
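
The wait-and-poll behavior can be modeled with a small Python
simulation.  This is a sketch under assumptions, not the S2 kernel's
logic: the function and parameter names are hypothetical, and whether
power returning exactly at the end of the period counts as "restored"
is a guess.

```python
def power_fail_decision(power_ok_at, grace=15, poll=5):
    """Simulate the kernel's polling after a power-loss interrupt.

    power_ok_at -- function taking elapsed seconds and returning True
                   if power is back at that moment (a stand-in for the
                   I/O processor's report)
    grace       -- configured delay in seconds (15 is the standard figure)
    poll        -- polling interval in seconds
    """
    elapsed = 0
    while elapsed < grace:
        elapsed += poll
        if power_ok_at(elapsed):
            return "no action"      # transient: applications unaffected
    return "power failure"          # allotted time elapsed, power still off
```

With the default 15-second period, an outage that ends within 10
seconds yields "no action", while one lasting 20 seconds yields
"power failure" unless the period is raised to 30 seconds.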

>5. Has anybody used it - is it useful? Is there enough time to sync disks
>   upon receipt of SIGPWR (I presume there's hardly time for anything)?

I've written demo programs to catch SIGPWR on the S2, but haven't used
it in an actual application, nor have I modified any freeware programs
to support it.  I leave it to others to propose applications for which
it would be most useful.

By default S2 processes get 30 seconds to do SIGPWR handling before
they're suspended.  This can be decreased or increased by the system
administrator.  (Disks get synced in any case, whether the system
administrator has configured for a shutdown or for a restart.)

Handling SIGPWR correctly in the general case is quite tricky, since
you have to account for interrupted system calls, date/time changes
across the power failure, and the like.  There's a whole section on it
in the Integrity S2 Programmer's Guide, with an example I still
haven't been able to puzzle out fully.
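
To give a flavor of just one of these problems, interrupted system
calls, here is a minimal Python sketch of the usual retry pattern.
It is not the Programmer's Guide example, and the helper name is
made up.  Python 3.5 and later already retry most calls on their
own, so this mainly shows the loop a C program would write around a
call that fails with EINTR.  Date/time changes are not addressed.

```python
def retry_if_interrupted(call, *args):
    """Repeat a call until it is not cut short by a signal."""
    while True:
        try:
            return call(*args)
        except InterruptedError:
            # EINTR: a signal handler ran during the call; try again.
            continue
```

A caller would wrap each blocking operation, for instance
retry_if_interrupted(os.read, fd, 512), so that a SIGPWR arriving
mid-call does not surface as a spurious error.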

There are also some problematic issues from the system point of view.
For example, if the kernel starts a power fail shutdown it has to
finish it even if power comes back on during the shutdown.  (In this
case the system reloads memory after it finishes dumping it.)

>6. Why is it there?

It's a very useful signal to have if your system is supported by a UPS
that can keep it up for a long enough period of time for appropriate
actions to be taken.  It's even more useful if your system can resume
applications across a power failure, like the S2, as opposed to just
initiating a standard shutdown.

I'm interested in hearing about any comparable implementations of
SIGPWR on other systems.
-- 
Frank Hecker
...!uunet!tdmfed!hecker