3 Jun 2015
12:49 UTC
Matthew Bloch
We were surprised on Saturday by an administration failure of our main PostgreSQL server. Over the last year it had inadvertently become a choke point for both our monitoring and our support system, and both of these, along with the forum, were down for the duration of the failure.
Remedial work is underway to ensure that these systems are properly separated again, that failures of any one of them will be noticed immediately by our on-call engineer, and that our monitoring system itself has a better failsafe.