Validator.nu Downtime

Validator.nu was down last week.

I am sorry for the trouble this has caused users of the service, and I will take steps to reduce the probability of prolonged downtime in the future.

In case you are curious, here’s what happened: classic Murphy. I left for a vacation. Unknown to me at that point, on the morning I left, the kernel on the virtual machine running Validator.nu killed Apache. Presumably, this happened due to an out-of-memory condition caused by a bad swap configuration. I blame the hosting provider for the bad default configuration: as much swap as RAM (as opposed to twice as much swap as RAM). Not making sure that the problem stayed fixed was my fault, though. I had fixed the swap configuration months ago, but I had had no reason to reboot since then, and I had failed to check that the fix would persist across a reboot. It didn’t. A week before leaving on vacation, I installed a kernel security update that required a reboot, and the reboot restored the bad configuration. That the swap configuration had gone back to bad wasn’t immediately obvious.
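A check along the following lines, run after every reboot, would probably have caught the regression. This is only a minimal sketch and not something that actually runs on the server: the function names `parse_meminfo` and `swap_is_sufficient` are made up for illustration, and the 2.0 threshold is just the "twice as much swap as RAM" rule of thumb mentioned above. It assumes the usual Linux `/proc/meminfo` format, with `MemTotal` and `SwapTotal` lines giving sizes in kB.

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to their integer values (kB for sized fields)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def swap_is_sufficient(text, min_ratio=2.0):
    """Return True if SwapTotal is at least min_ratio times MemTotal.

    min_ratio=2.0 is an assumption taken from the rule of thumb in the
    post, not a universally correct value.
    """
    info = parse_meminfo(text)
    return info["SwapTotal"] >= min_ratio * info["MemTotal"]
```

To use it, read `/proc/meminfo` at boot (for example from a cron `@reboot` job), pass the contents to `swap_is_sufficient`, and send an alert when it returns `False`.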

For some reason, these things tend to happen only at vacation time, and the whole point of a vacation is staying away from IRC and email, which means one doesn’t see the signs of things going wrong. The previous prolonged downtime also took place when I was on vacation and traveling. After that incident, I migrated Validator.nu to a different hosting solution in order to eliminate the root cause of that incident. As it happens, though, I had been too lazy to migrate about.validator.nu, which is why it stayed available this time.