UPSMON.CONF(5) Network UPS Tools (NUT) UPSMON.CONF(5)
NAME
upsmon.conf - Configuration for Network UPS Tools upsmon
DESCRIPTION
This file's primary job is to define the systems that
upsmon(8) will monitor and to tell it how to shut down the
system when necessary. It will contain passwords, so keep
it secure. Ideally, only the upsmon process should be able
to read it.
Additionally, other optional configuration values can be
set in this file.
CONFIGURATION DIRECTIVES
DEADTIME seconds
upsmon allows a UPS to go missing for this many
seconds before declaring it "dead". The default is
15 seconds.
upsmon requires a UPS to provide status information
every few seconds (see POLLFREQ and POLLFREQALERT)
to keep things updated. If the status fetch fails,
the UPS is marked stale. If it stays stale for
more than DEADTIME seconds, the UPS is marked dead.
A dead UPS that was last known to be on battery is
assumed to have changed to a low battery condition.
This may force a shutdown if it is providing a
critical amount of power to your system. This
seems disruptive, but the alternative is barreling
ahead into oblivion and crashing when you run out
of power.
Note: DEADTIME should be a multiple of POLLFREQ and
POLLFREQALERT. Otherwise, you'll have "dead"
UPSes simply because upsmon isn't polling them
quickly enough. Rule of thumb: take the larger of
the two POLLFREQ values, and multiply by 3.
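As a rough illustration of the rule of thumb above, the
fragment below assumes a POLLFREQ of 10 and a POLLFREQALERT
of 5. These values are hypothetical, not recommendations.
The larger of the two is 10, so DEADTIME becomes 30, which
is also a multiple of both.

```conf
# Hypothetical values chosen only to illustrate the rule of thumb.
POLLFREQ 10
POLLFREQALERT 5
# larger poll interval (10) x 3 = 30; 30 is a multiple of both 10 and 5
DEADTIME 30
```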
FINALDELAY seconds
When running in master mode, upsmon waits this long
after sending the NOTIFY_SHUTDOWN to warn the
users. After the timer elapses, it then runs your
SHUTDOWNCMD. By default this is set to 5 seconds.
If you need to let your users do something in
between those events, increase this number.
Remember, at this point your UPS battery is almost
depleted, so don't make this too big.
Alternatively, you can set this very low so you
don't wait around when it's time to shut down.
Some UPSes don't give much warning for low battery
and will require a value of 0 here for a safe
shutdown.
Note: If FINALDELAY on the slave is greater than
HOSTSYNC on the master, the master will give up
waiting for the slave to disconnect.
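One way to keep these two timers consistent is sketched
below, assuming one master and one slave that both use the
default values given in this page. The only point is that
the slave's FINALDELAY stays below the master's HOSTSYNC.

```conf
# upsmon.conf on the master (documented default)
HOSTSYNC 15

# upsmon.conf on the slave (documented default)
# 5 is less than 15, so the master should not give up waiting on this slave
FINALDELAY 5
```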
HOSTSYNC seconds
upsmon will wait up to this many seconds in master
mode for the slaves to disconnect during a shutdown
situation. By default, this is 15 seconds.
When a UPS goes critical (on battery + low battery,
or "FSD" - forced shutdown), the slaves are
supposed to disconnect and shut down right away. The
HOSTSYNC timer keeps the master upsmon from sitting
there forever if one of the slaves gets stuck.
This value is also used to keep slave systems from
getting stuck if the master fails to respond in
time. After a UPS becomes critical, the slave will
wait up to HOSTSYNC seconds for the master to set
the FSD flag. If that timer expires, the slave
will assume that the master is broken and will shut
down anyway.
This keeps the slaves from shutting down during a
short-lived status change to "OB LB" that the
slaves see but the master misses.
MINSUPPLIES num
Set the number of power supplies that must be
receiving power to keep this system running.
Normal computers have just one power supply, so the
default value of 1 is acceptable.
Large/expensive server type systems usually have
more, and can run with a few missing. The HP
NetServer LH4 can run with 2 out of 4, for example, so
you'd set it to 2. The idea is to keep the box
running as long as possible, right?
Obviously you have to put the redundant supplies on
different UPS circuits for this to make sense! See
big-servers.txt in the docs subdirectory for more
information and ideas on how to use this feature.
Also see the section on "power values" in
upsmon(8).
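A sketch of the 2-out-of-4 case described above, assuming
two UPSes attached to the local upsd, each feeding two of
the four supplies. The UPS names, user name, and password
are hypothetical.

```conf
# Each UPS feeds 2 of the 4 power supplies, so each has a power value of 2.
MONITOR upsA@localhost 2 monmaster blah master
MONITOR upsB@localhost 2 monmaster blah master
# The box runs on 2 supplies, so losing one UPS (2 remaining) is tolerated.
MINSUPPLIES 2
```

With this layout the overall power value would be 4 while
both UPSes are healthy and 2 when one goes critical, which
still meets MINSUPPLIES.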
MONITOR system powervalue username password type
Each UPS that you need to monitor should have a
MONITOR line. Not all of these need to supply power
to the system that is running upsmon. You may
monitor other systems if you want to be able to send
notifications about status changes on them.
You must have at least one MONITOR directive in
this file.
system is a UPS identifier. It is in this form:
[<upsname>@]<hostname>[:<port>]
Some examples:
"localhost" is the first UPS on the local
system.
"su700@mybox" is a UPS called "su700" ([su700]
in ups.conf) on a system called "mybox".
"elvis:1234" is the first UPS on a system
called elvis, which is running upsd(8) on port
1234.
To use all of the options together:
"fenton@bigbox:5678" is a UPS called "fenton"
on a system called "bigbox" which runs upsd(8) on
port 5678. Phew!
powervalue is an integer representing the number of
power supplies that the UPS feeds on this system.
Most normal computers have one power supply, and
the UPS feeds it, so this value will be 1. You
need a very large or special system to have
anything higher here.
You can set the powervalue to 0 if you want to
monitor a UPS that doesn't actually supply power to
this system. This is useful when you want to have
upsmon do notifications about status changes on a
UPS without shutting down when it goes critical.
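For instance, a monitor-only entry might look like the
line below. It reuses the "su700@mybox" identifier from
the examples above. The user name and password are
hypothetical, and the type is assumed to be slave since
this system does not manage that UPS.

```conf
# powervalue 0: report status changes for this UPS, but never shut down for it
MONITOR su700@mybox 0 monuser secret slave
```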
The username and password on this line must match
an entry in that system's upsd.users(5). If your
username is "monmaster" and your password is
"blah", the MONITOR line might look like this:
MONITOR myups@bigserver 1 monmaster blah master
Meanwhile, the upsd.users on 'bigserver' would look
like this:
[monmaster]
password = blah
allowfrom = (ACLs from upsd.conf(5))
upsmon master (or slave)
The type refers to the relationship with upsd(8).
It can be either "master" or "slave". See
upsmon(8) for more information on the meaning of
these modes. The mode you pick here also goes in
the upsd.users file, as seen in the example above.
NOCOMMWARNTIME seconds
upsmon will trigger a NOTIFY_NOCOMM after this many
seconds if it can't reach any of the UPS entries in
this configuration file. It keeps warning you
until the situation is fixed. By default this is
300 seconds.
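A brief sketch, assuming you want a longer interval than
the default. The value 600 is an arbitrary illustration.

```conf
# Use 600 seconds (10 minutes) instead of the default 300
NOCOMMWARNTIME 600
```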
NOTIFYCMD command
upsmon calls this to send messages when things
happen.
This command is called with the full text of the
message as one argument. The environment string
NOTIFYTYPE will contain the type string of whatever
caused this event to happen.
If you need to use upssched(8), then you must make
it your NOTIFYCMD by listing it here.
Note that this is only called for NOTIFY events
that have EXEC set with NOTIFYFLAG. See NOTIFYFLAG
below for more details.
Making this some sort of shell script might not be
a bad idea. For more information and ideas, see
pager.txt in the docs directory.
Remember, this also needs to be one element in the
configuration file, so if your command has spaces,
then wrap it in quotes.
NOTIFYCMD "/path/to/script --foo --bar"
This script is run in the background - that is,
upsmon forks before it calls out to start it. This
means that your NOTIFYCMD may have multiple
instances running simultaneously if a lot of stuff
happens all at once. Keep this in mind when
designing complicated notifiers.
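Putting these pieces together, a minimal sketch might look
like the following. The script path and its option are
hypothetical. The NOTIFYFLAG lines are there because
NOTIFYCMD only runs for events that have EXEC set.

```conf
# Hypothetical notifier script; quote the whole command because it contains spaces.
NOTIFYCMD "/usr/local/bin/ups-notify --verbose"
# Without EXEC on an event, NOTIFYCMD is never called for it.
NOTIFYFLAG ONBATT SYSLOG+WALL+EXEC
NOTIFYFLAG COMMBAD SYSLOG+EXEC
```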
NOTIFYMSG type message
upsmon comes with a set of stock messages for
various events. You can change them if you like.
NOTIFYMSG ONLINE "UPS %s is getting line power"
NOTIFYMSG ONBATT "Someone pulled the plug on %s"
Note that %s is replaced with the identifier of the
UPS in question.
Possible values for type:
ONLINE - UPS is back online
ONBATT - UPS is on battery
LOWBATT - UPS is on battery and has a low battery (is critical)
FSD - UPS is being shut down by the master (FSD = "Forced Shutdown")
COMMOK - Communications established with the UPS
COMMBAD - Communications lost to the UPS
SHUTDOWN - The system is being shut down
REPLBATT - The UPS battery is bad and needs to be replaced
NOCOMM - A UPS is unavailable (can't be contacted for monitoring)
The message must be one element in the
configuration file, so if it contains spaces, you must wrap
it in quotes.
NOTIFYMSG NOCOMM "Someone stole UPS %s"
NOTIFYFLAG type flag[+flag][+flag]...
By default, upsmon sends walls (global messages to
all logged in users) via /bin/wall and writes to
the syslog when things happen. You can change
this.
Examples:
NOTIFYFLAG ONLINE SYSLOG
NOTIFYFLAG ONBATT SYSLOG+WALL+EXEC
Possible values for the flags:
SYSLOG - Write the message to the syslog
WALL - Write the message to all users with /bin/wall
EXEC - Execute NOTIFYCMD (see above) with the message
IGNORE - Don't do anything
If you use IGNORE, don't use any other flags on the
same line.
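As a further illustration, the lines below assume you want
routine events kept out of users' terminals while critical
ones still reach everyone. Which events count as routine
is a local choice, not something this page prescribes.

```conf
# Routine events: log only
NOTIFYFLAG ONLINE SYSLOG
NOTIFYFLAG COMMOK SYSLOG
# Critical events: log, wall, and run NOTIFYCMD
NOTIFYFLAG LOWBATT SYSLOG+WALL+EXEC
NOTIFYFLAG FSD SYSLOG+WALL+EXEC
# IGNORE stands alone; no other flags on the same line
NOTIFYFLAG REPLBATT IGNORE
```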
POLLFREQ seconds
Normally upsmon polls the upsd(8) server every 5
seconds. If this is flooding your network with
activity, you can make it higher. You can also
make it lower to get faster updates in some cases.
There are some catches. First, if you set the
POLLFREQ too high, you may miss short-lived power
events entirely. You also risk triggering the
DEADTIME (see above) if you use a very large
number.
Second, there is a point of diminishing returns if
you set it too low. While upsd normally has all of
the data available to it instantly, most drivers
only refresh the UPS status once every 2 seconds.
Polling any more than that usually doesn't get you
the information any faster.
POLLFREQALERT seconds
This is the interval that upsmon waits between
polls if any of its UPSes are on battery. You can
use this along with POLLFREQ above to slow down
polls during normal behavior, but get quicker
updates when something bad happens.
This should always be equal to or lower than the
POLLFREQ value. By default it is also set to 5
seconds.
The warnings from the POLLFREQ entry about too-high
and too-low values also apply here.
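A sketch of that arrangement, with hypothetical values:
poll slowly while on line power and quickly once any UPS
is on battery. DEADTIME is raised as well, since the
default of 15 would be shorter than the slower poll
interval.

```conf
# Relaxed polling during normal operation
POLLFREQ 30
# Faster polling while any UPS is on battery (must not exceed POLLFREQ)
POLLFREQALERT 5
# Three times the larger interval, per the note under DEADTIME
DEADTIME 90
```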
POWERDOWNFLAG filename
upsmon creates this file when running in master
mode when the UPS needs to be powered off. You
should check for this file in your shutdown scripts
and call upsdrvctl shutdown if it exists.
This is done to forcibly reset the slaves, so they
don't get stuck at the "halted" stage even if the
power returns during the shutdown process. This
usually does not work well on contact-closure UPSes
that use the genericups driver.
See the shutdown.txt file in the docs subdirectory
for more information.
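A minimal sketch of the directive, assuming /etc/killpower
as the flag file. That path is only an assumption for
illustration; use whatever location your shutdown scripts
check.

```conf
# Assumed path; your shutdown script must test for the same file
POWERDOWNFLAG /etc/killpower
```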
RBWARNTIME seconds
When a UPS says that it needs to have its battery
replaced, upsmon will generate a NOTIFY_REPLBATT
event. By default this happens every 43200 seconds
- 12 hours.
If you need another value, set it here.
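For example, to be reminded once a day rather than twice,
a setting like the following could be used. The value is
illustrative only.

```conf
# 86400 seconds = 24 hours
RBWARNTIME 86400
```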
RUN_AS_USER username
upsmon normally runs the bulk of the monitoring
duties under another user ID after dropping root
privileges. On most systems this means it runs as
"nobody", since that's the default from
compile-time.
The catch is that "nobody" can't read your
upsmon.conf, since by default it is installed so
that only root can open it. This means you won't
be able to reload the configuration file, since it
will be unavailable.
The solution is to create a new user just for
upsmon, then make it run as that user. I suggest
"nutmon", but you can use anything that isn't
already taken on your system. Just create a
regular user with no special privileges and an
impossible password.
Then, tell upsmon to run as that user, and make
upsmon.conf readable by it. Your reloads will
work, and your config file will stay secure.
This file should not be writable by the upsmon
user, as it would be possible to exploit a hole,
change the SHUTDOWNCMD to something malicious, then
wait for upsmon to be restarted.
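In upsmon.conf this comes down to one line, shown here
with the "nutmon" name suggested above. Creating the user
and adjusting file ownership and permissions happen
outside this file and are not shown.

```conf
# Assumes a user named nutmon already exists and can read, but not write, this file
RUN_AS_USER nutmon
```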
SHUTDOWNCMD command
upsmon runs this command when the system needs to
be brought down. If it is a slave, it will do that
immediately whenever the current overall power
value drops below the MINSUPPLIES value above.
When upsmon is a master, it will allow any slaves
to log out before starting the local shutdown
procedure.
Note that the command needs to be one element in
the config file. If your shutdown command includes
spaces, then put it in quotes to keep it together,
e.g.:
SHUTDOWNCMD "/sbin/shutdown -h +0"
SEE ALSO
upsmon(8), upsd(8), nutupsdrv(8).
Internet resources:
The NUT (Network UPS Tools) home page:
http://www.exploits.org/nut/
NUT mailing list archives and information:
http://lists.exploits.org/
Wed Oct 16 2002 UPSMON.CONF(5)