As I am writing this post Wednesday morning, several Oakley Studio client web sites are unreachable. (Don’t worry — they’ll likely be back by the time you read this.) These sites became unreachable because Domain Name System (DNS) servers providing essential domain name routing information have gone missing.
This is as much a Twitter success story as it is a story about DNS failure.
Part of my job as webmaster is to manage your web site’s domain name. For many years I’ve contracted a third-party DNS service provider to handle the job of converting domain names into IP addresses, in order to route visitors to your web site on the Oakley Studio web server, and ensure that your inbound and outbound email can find the Oakley Studio email server. That company — ZoneEdit, Inc. — appears to be failing badly.
Occasional DNS server outages are to be expected. Redundancy is what keeps web sites from disappearing when that happens. There are always at least two DNS servers providing routing data for any web site, so if one of those servers is down temporarily, the second remains available to provide essential routing data.
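For the technically curious, here is a rough sketch of how that redundancy plays out from the point of view of whoever is looking up a domain. It is only an illustration: `resolve_with_failover` and the `query` callback are names I made up for this example, and a real resolver does considerably more (caching, retries, timeouts).

```python
from typing import Callable, Optional, Sequence


def resolve_with_failover(
    name: str,
    nameservers: Sequence[str],
    query: Callable[[str, str], str],
) -> Optional[str]:
    """Ask each nameserver in turn; return the first answer, or None if all fail.

    `query(server, name)` is a hypothetical callback that returns an IP
    address string, or raises OSError when the server cannot be reached.
    """
    for server in nameservers:
        try:
            return query(server, name)
        except OSError:
            # This server is down or timed out; fall through to the next one.
            continue
    # Every designated nameserver failed, so the name cannot be resolved.
    return None
```

With two nameservers listed, one of them can be offline and the lookup still succeeds. The function only returns `None`, meaning the site is effectively unreachable, when every listed server is down at once.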
What ZoneEdit appears to be experiencing is a cascading failure.
But I’m getting ahead of myself… This story actually began two weeks ago, when one of my clients called to report that their web site was down. I did some checking — I was also not able to get to the web site, but other sites hosted on the Oakley Studio web server were coming up ok, so the server was fine. The next possibility was an expired domain name, but I was pretty sure I hadn’t let that happen. A quick check with our registrar, Network Solutions, indicated no problem there.
Next I began exploring whether DNS was malfunctioning. I queried the DNS server at my ISP, Earthlink, and found that it had old data! It was pointing to where the Oakley Studio web server had been located up until late June of this year, before I moved the server to a colocation facility in Wisconsin.
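A check like this can be scripted. The sketch below is not the tool I actually used. It assumes the machine's default resolver is the one under suspicion (for example, the ISP's), and `dns_answer_is_stale` is simply a name chosen for this example. It compares what the resolver returns against the address where the server is known to live now.

```python
import socket
from typing import Callable


def dns_answer_is_stale(
    hostname: str,
    expected_ip: str,
    lookup: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """Return True if the resolver's answer differs from the server's current IP.

    By default this uses the system resolver via socket.gethostbyname, which
    returns a single IPv4 address. Pass a different `lookup` to test against
    another source of answers.
    """
    answer = lookup(hostname)
    return answer != expected_ip
```

Calling `dns_answer_is_stale("www.example.com", "192.0.2.10")` with a real hostname and the server's current address (both placeholders here) returns `True` when the resolver is still handing out an old location.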
Bad DNS data. That explained why this client’s site was unreachable. But where was the old data coming from? I logged into my ZoneEdit dashboard and checked the domain name records there. Everything looked good, but I republished the records just in case (pushing them out to the distributed DNS network) in hopes it might resolve the problem. And apparently it did — within a few minutes, my client’s web site was back. I submitted a Support Request to ZoneEdit about this strange occurrence and asked them to check into it; but I received no word back. That was odd.
A few days later another client was having trouble with email delivery. This time the problem seemed to resolve on its own before I was able to pinpoint the cause. But I suspected DNS — something peculiar was happening with ZoneEdit’s DNS servers.
And then three days ago, on Sunday afternoon, I received a call from another client reporting that he was unable to bring up his web site. I checked my ZoneEdit dashboard, and 5 of their DNS servers were down. Both designated DNS servers for this client’s web site were offline! I submitted an urgent support request to ZoneEdit, and then began working to migrate DNS service to a second provider.
First I put out a request on Twitter, and within a few minutes got back several recommendations of other DNS service providers. I opened an account with Dyn in Manchester, New Hampshire. They offer a free 30-day trial. Unfortunately, it was Sunday, and a welcome email from Dyn informed me the new account would be provisioned (DNS servers assigned to my account) during normal working hours the next day.
About an hour later I received another message on Twitter from Carl Levine, a “concierge technician” at Dyn, saying he had pushed through the provisioning and my account was ready to begin transferring domain names. Turns out he was the one who had recommended Dyn an hour earlier. He had been following my conversation, caught my mention of account setup with his company, and expedited the account provisioning within the hour — and on a Sunday! Wow, that is FANTASTIC customer service! So within a surprisingly short time my client’s site was back again, with DNS service now provided by Dyn.
Meanwhile it appeared that ZoneEdit’s problems were worsening. Yesterday evening 7 out of 13 ZoneEdit DNS servers were down. The graphic to the right is what I saw yesterday in my ZoneEdit dashboard. Green indicates a server is up and running, red means a server is unreachable. After more than a decade of reliable service to Oakley Studio and its clients, ZoneEdit appears to be having some severe technical difficulties.
I could not wait and hope their problems would be resolved, so I moved most of my other clients’ domains to Dyn and am moving the rest this morning. Not simply because those servers are still down 12 hours later, but because I have still received no response to my three support requests to ZoneEdit, and that’s just unacceptable.
Do a search for “ZoneEdit” today on Twitter and you’ll see how much grief the ZoneEdit server outage is causing. Another customer tweets woefully:
“Its one thing to have an outage, even one this long but how do you explain not giving a simple status update to all your customers?”
Over half the ZoneEdit servers are still down as I finish this post, and the specific nature of the problem remains unknown.
Some interesting follow-ups to this story:
- ZoneEdit’s problems began shortly after a major DNS outage at domain registrar GoDaddy. That outage took millions of web sites down for about a day. Are the ZoneEdit problems related to that GoDaddy failure?
- Turns out Dyn is a DNS service provider for, get this… Twitter!