South Pole Logbook

May 18, 2009

Power outage


Author: Erik
Mood: Excellent!

So here is the situation: almost all of rack 13 was OFF. Just OFF. The affected machines were:

* sps-fpslave03
* sps-fpslave04
* sps-fpslave05
* sps-fpslave06
* sps-fpslave07
* sps-fpslave08
* sps-fpslave09
* sps-fpslave10
* sps-fpmaster
* sps-cvmaster

Only sps-fpslave01, sps-build32, and all the blades were ON when I arrived. In addition, sps-itfreeze in rack 13 was OFF too. This is not the first time that machine has powered OFF when a glitch appears somewhere else in the ICL.

All machines are ON again and booting. Since they were powered OFF in a rather brutal way, they are now running disk checks, which take a long time. I can't access sps-fpslave05 and sps-fpslave06 through the KVM console, but they are up and running.

Here is what most likely happened. Both UPSes in rack 16 are dead: a red light is ON on the upper one, and a red LED is blinking on the lower one. This is pretty bad news, because the only spare UPS we had was moved to rack 9 last week to replace another failed one. According to the manual, the batteries are dead, and this is confirmed by the HP Power Management software. Its logs show the following entries for May 18, in chronological order:

* 15:10:27 The UPS is in a low battery condition
* 15:10:27 The UPS is on battery
* 15:10:34 The UPS battery has failed
* 15:10:34 The UPS low battery condition has cleared
* 15:10:34 Utility power has been restored
* 15:12:46 A UPS battery has failed
* 15:20:14 A UPS battery has failed
* 15:20:16 Connection with the remote agent was lost (probably when the server was powered off)

On another UPS I can see that there was a brief power cut lasting a few seconds. The good news is that rack 9 is still running. I'll talk to the power plant manager tomorrow morning to confirm there was a problem.
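To put numbers on the UPS log entries quoted above, here is a minimal Python sketch that computes how long the UPS ran on battery and how long it took before the remote agent dropped. The timestamps and messages are transcribed from the post; the helper names (`first_time`, `seconds_between`) are made up for this illustration and are not part of the HP Power Management software.

```python
from datetime import datetime

# Events transcribed by hand from the log entries quoted in the post (May 18).
EVENTS = [
    ("15:10:27", "The UPS is in a low battery condition"),
    ("15:10:27", "The UPS is on battery"),
    ("15:10:34", "The UPS battery has failed"),
    ("15:10:34", "The UPS low battery condition has cleared"),
    ("15:10:34", "Utility power has been restored"),
    ("15:12:46", "A UPS battery has failed"),
    ("15:20:14", "A UPS battery has failed"),
    ("15:20:16", "Connection with the remote agent was lost"),
]


def first_time(events, needle):
    """Return the timestamp of the first event whose message contains needle."""
    for stamp, message in events:
        if needle in message:
            return stamp
    raise ValueError(f"no event matching {needle!r}")


def seconds_between(start, end):
    """Whole seconds between two HH:MM:SS timestamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


on_battery = first_time(EVENTS, "on battery")
restored = first_time(EVENTS, "power has been restored")
agent_lost = first_time(EVENTS, "Connection with the remote agent")

print("Seconds on battery:", seconds_between(on_battery, restored))
print("Seconds until agent lost:", seconds_between(on_battery, agent_lost))
```

With the entries above this gives 7 seconds on battery and 589 seconds (just under ten minutes) from the first event to the loss of the remote agent.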


Erik.

Erik Verhagen | 18 May 2009 11:56 GMT | Ice Cube/Power