5 Sep 2016
18:36 UTC
Bytemark Engineer
A design flaw in our DNS processor meant that handling excessive amounts of bogus data significantly delayed the processing of stored records. As soon as the server started, it would stall almost indefinitely while processing the existing data before attempting to process new additions. The delay meant that subsequent uploads were queued rather than processed immediately, further contributing to the load.
Once the queue of pending uploads became large enough, further uploads were disabled to prevent the problem from becoming even more acute. This manifested itself as uploads failing with "server limit reached".
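For readers who want a picture of the upload limit described above, here is a minimal sketch of a bounded queue that refuses new work once it is full. It is illustrative only and is not our production code: the names UploadQueue, UploadLimitReached and max_pending are hypothetical, and only the "server limit reached" message is taken from the real error.

```python
from collections import deque


class UploadLimitReached(Exception):
    """Raised when the pending-upload queue is full (hypothetical name)."""


class UploadQueue:
    """Illustrative bounded queue; not the actual Bytemark implementation."""

    def __init__(self, max_pending):
        # max_pending is an assumed tunable: the size at which uploads are refused.
        self.max_pending = max_pending
        self.pending = deque()

    def submit(self, upload):
        # Refuse new uploads once the backlog reaches the limit, so that a
        # stalled processor does not accumulate an unbounded queue.
        if len(self.pending) >= self.max_pending:
            raise UploadLimitReached("server limit reached")
        self.pending.append(upload)

    def process_next(self):
        # Hand the oldest pending upload to the processor, or None if idle.
        return self.pending.popleft() if self.pending else None
```

Rejecting uploads at the limit trades availability of the upload service for a bounded backlog: clients see an immediate failure and can retry later, rather than adding to a queue that the processor cannot drain.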
Once the source of the excessive (and bogus) data was identified, it was removed, but a more robust solution will need to be implemented as soon as realistically possible to prevent a recurrence of the problem.
A restart of the upload-processing script was required. This caused further problems because the system has grown much larger than its programmers originally expected, and there were a number of performance problems when starting up the process. The source code had to be modified in order to bring the process back online within an acceptable timescale. The code will remain under review, pending upgrades to help it deal with large amounts of data and recover from downtime more quickly in future.
5 Sep 2016
17:11 UTC
Telyn Roat
At around 2:00 PM BST some of our customers reported failures while attempting to upload new DNS data to upload.ns.bytemark.co.uk. We are currently investigating this problem.
Our authoritative nameservers are unaffected and continue to serve the data they held prior to the uploads made after approximately 1:30 PM.