This is where we should keep track of info about the power supplies, power usage, etc.
Most of our servers have two power supplies. If there is a steady beeping noise in the server room, it's likely that, on at least one server, one of the two power supplies is no longer drawing current (but the other is).
Power supply units
We have several layers of power supply hardware, some with software interfaces and others not.
We get power from the usual Earlham electric grid (hereafter called "wall"), and we mostly don't manage it. Reach out to facilities with issues.
We also have an internal backup power supply system in case wall power goes down. It's a large power supply unit with several batteries of cells (which we replace every 3-4 years; the last replacement as of this writing was in the 2018-19 year). This unit is mounted to the bottom of the Arctic rack.
That larger system feeds a satellite power distribution unit at the very bottom of the Antarctic rack, feeding the cluster machines and others on that rack. Each set of four outlets has its own circuit breaker. If power goes down for a subset of the Antarctic rack (i.e. some outlets are energized and others aren't), check these circuit breakers first.
Each of those power units feeds a series of power distribution units that act as industrial-scale power strips, carrying power to devices and in some cases providing surge protection.
The daemon apcupsd looks useful for controlling the power supply.
The UPS web interface is only accessible using lynx with the IP address 188.8.131.52, directly from the console in the server room. All attempts to connect from other machines are blocked by the firewall.
Our installation of apcupsd from source is incomplete. The steps are completed up to the point where we modify the conf file.
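Once apcupsd is fully installed, its companion tool apcaccess prints the UPS state as "KEY : VALUE" lines, which are easy to read from a script. The following is only a sketch under that assumption: the function names are ours, and SAMPLE is illustrative text in apcaccess's format, not output captured from our UPS.

```python
import subprocess


def parse_apcaccess(text):
    """Turn apcaccess 'KEY : VALUE' lines into a dict of stripped strings."""
    status = {}
    for line in text.splitlines():
        # Split on the first colon only, since values such as dates contain colons.
        key, sep, value = line.partition(":")
        if sep:
            status[key.strip()] = value.strip()
    return status


def read_ups_status():
    """Run 'apcaccess status' (assumes apcupsd is installed and running)."""
    result = subprocess.run(
        ["apcaccess", "status"], capture_output=True, text=True, check=True
    )
    return parse_apcaccess(result.stdout)


# Illustrative sample only; real output has many more fields.
SAMPLE = """\
STATUS   : ONLINE
BCHARGE  : 100.0 Percent
TIMELEFT : 22.0 Minutes
"""

if __name__ == "__main__":
    status = parse_apcaccess(SAMPLE)
    charge = float(status["BCHARGE"].split()[0])
    print(status["STATUS"], charge)
```

Parsing is kept separate from running the command so the parser can be tried on pasted text from any machine, even one that cannot reach the UPS.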
SNMP tools are enabled. Info here.
Some snmpget commands
Here are some common OIDs (object identifiers). SNMP uses these OIDs to get very specific, discrete pieces of information from a device.
As an example, we've run the following commands to get info about the power supply for cluster:
> /usr/bin/snmpget -v 1 -c public ups.cluster.earlham.edu .184.108.40.206.4.1.3220.127.116.11.18.104.22.168
SNMPv2-SMI::enterprises.322.214.171.124.126.96.36.199 = STRING: "Smart-UPS RT 10000 XL"

> /usr/bin/snmpget -v 1 -c public ups.cluster.earlham.edu .188.8.131.52.4.1.3184.108.40.206.220.127.116.11
SNMPv2-SMI::enterprises.318.104.22.168.22.214.171.124 = INTEGER: 9

> /usr/bin/snmpget -v 1 -c public ups.cluster.earlham.edu .126.96.36.199.4.1.3188.8.131.52.184.108.40.206
SNMPv2-SMI::enterprises.3220.127.116.11.18.104.22.168 = Timeticks: (132000) 0:22:00.00

> /usr/bin/snmpget -v 1 -c public ups.cluster.earlham.edu .22.214.171.124.4.1.3126.96.36.199.188.8.131.52
SNMPv2-SMI::enterprises.3184.108.40.206.220.127.116.11 = Gauge32: 100

> /usr/bin/snmpget -v 1 -c public ups.cluster.earlham.edu .18.104.22.168.22.214.171.124.0
SNMPv2-MIB::sysLocation.0 = STRING: Noyes Machine Room, Arctic Rack

> /usr/bin/snmpget -v 1 -c public ups.cluster.earlham.edu .126.96.36.199.4.1.3188.8.131.52.184.108.40.206
SNMPv2-SMI::enterprises.3220.127.116.11.18.104.22.168 = Gauge32: 26
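Each reply above has the shape "name = TYPE: value", so it can be split apart mechanically if we ever want to log or graph these readings. The following is a minimal sketch, assuming the plain output format shown above. The function name and the "example.0" object name are hypothetical. Timeticks count hundredths of a second, so the (132000) reading corresponds to 1320 seconds, or 22 minutes.

```python
import re

# Matches one reply line such as
#   SNMPv2-MIB::sysLocation.0 = STRING: Noyes Machine Room, Arctic Rack
LINE_RE = re.compile(r"^(?P<name>\S+) = (?P<type>[A-Za-z0-9-]+): (?P<value>.*)$")


def parse_snmpget_line(line):
    """Return (name, type, value) for one line of snmpget output."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized snmpget output: {line!r}")
    name, kind, raw = match.group("name", "type", "value")
    if kind == "Timeticks":
        # The raw tick count is in parentheses; convert it to seconds.
        ticks = int(raw[raw.index("(") + 1 : raw.index(")")])
        value = ticks / 100
    elif kind in ("INTEGER", "Gauge32"):
        # Keep the raw text if the value is not a bare number.
        value = int(raw) if raw.lstrip("-").isdigit() else raw
    elif kind == "STRING":
        value = raw.strip('"')
    else:
        value = raw
    return name, kind, value


if __name__ == "__main__":
    print(parse_snmpget_line(
        "SNMPv2-MIB::sysLocation.0 = STRING: Noyes Machine Room, Arctic Rack"
    ))
    # "example.0" is a placeholder name, not a real object on our UPS.
    print(parse_snmpget_line("example.0 = Timeticks: (132000) 0:22:00.00"))
```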