Friday, February 23, 2018

Power Failure in Datacenter

Hello there,

This morning I received a report that SetCronJob was down on Feb 23 from 18:03 to 19:16 UTC. After investigating, I found that a power failure at Linode's Fremont datacenter caused the outage. You can read more about the incident here.

Because all of our core servers (database, web, cron distributor, etc.) are hosted in Fremont, they crashed and stopped working entirely for more than an hour. With our current configuration, there was nothing we could do about it. All we can do now is run each cronjob that missed executions during the outage once, so that no job is skipped entirely.

We are sorry for the inconvenience.

We are working on improving our system to prepare for incidents like this.