Details about yesterday's OVH outage

That is where I am stuck. I know boot storms are a problem, so systems need to be powered on in a set order, but 99.99% of the servers I have ever touched will, once powered on, keep trying to start their services until they succeed. So dependencies aren't really an issue if the SAN comes up 15 minutes after the server.
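To make the "keep trying until it succeeds" behavior concrete, here is a rough Python sketch of a retry loop with capped exponential backoff. This is only an illustration: `start_with_retry`, its parameters, and the choice of `OSError` as the "dependency not ready" signal are all made up for the example, not how any particular init system or service manager does it.

```python
import time


def start_with_retry(start, max_wait=1800.0, first_delay=1.0,
                     max_delay=60.0, sleep=time.sleep):
    """Call start() until it succeeds; return the number of attempts.

    Hypothetical helper: start() is assumed to raise OSError while a
    dependency (for example, storage on the SAN) is still unavailable.
    """
    delay = first_delay
    waited = 0.0
    attempts = 0
    while True:
        attempts += 1
        try:
            start()
            return attempts
        except OSError:
            # Give up once the total time spent sleeping reaches max_wait.
            if waited >= max_wait:
                raise
            sleep(delay)
            waited += delay
            # Double the delay each time, capped at max_delay.
            delay = min(delay * 2, max_delay)
```

With the defaults, a service whose storage shows up 15 minutes late would just keep retrying about once a minute and start on its own once the dependency is there, which matches what I have seen in practice.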

We had servers hosted at our clients' sites. When they lost power, you just waited until it came back on, and 15 minutes later you would log in and make sure everything was up.

The company's data centers have never lost power in the 20 years they have been up, but that is because every single server is plugged into two 100% separate power sources: separate power distribution in the data centers, separate UPSes, separate generators, and separate feeds from separate power plants. Everything is geographically diverse, with power entering from different sides of the DC and buildings, and it all gets tested constantly.
