
DNS Record TTL in Worst-Case Scenarios

I haven't found a good solution to this problem yet:

Imagine you're responsible for a small CDN (static images) spread across two datacenters. Traffic is balanced between the two DCs with an anycast name service: there is a nameserver in every DC, and each user reaches the nearest one. One scenario is that one of the datacenters goes down completely. You can run monitoring on the nameserver and route only to the DC that is still alive, no problem. But what about the TTL of the DNS records? Tiny TTLs such as 2 minutes are often ignored by some ISPs (e.g. AOL), so the client doesn't get the IP of the other datacenter. What could be a solution in this scenario?
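To make the monitoring part concrete, here is a minimal sketch of how a nameserver in one DC could choose which A records to hand out. The `Datacenter` class, the `pick_answers` function, and the IP addresses (documentation ranges) are all made up for illustration, and the health flag is assumed to come from whatever monitoring is already in place.

```python
from dataclasses import dataclass


@dataclass
class Datacenter:
    name: str
    ip: str
    healthy: bool  # result of the latest monitoring probe (assumed to exist)


def pick_answers(local: Datacenter, remote: Datacenter, ttl: int = 120):
    """Return the (ip, ttl) answers this nameserver should hand out."""
    if local.healthy:
        # Normal case: anycast already brought the user to the nearest DC.
        return [(local.ip, ttl)]
    if remote.healthy:
        # Local DC is down: point new lookups at the surviving DC.
        return [(remote.ip, ttl)]
    # Nothing is known to be alive: return both and let clients retry.
    return [(local.ip, ttl), (remote.ip, ttl)]


dc_a = Datacenter("dc-a", "192.0.2.10", healthy=False)
dc_b = Datacenter("dc-b", "198.51.100.10", healthy=True)
print(pick_answers(dc_a, dc_b))
```

This only helps clients that actually ask again. A resolver or browser that holds the old answer past the 120-second TTL never sees the new record, which is exactly the problem described above.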

Reader Comments (4)

Perhaps you should look into the possibilities of Global Load Balancing? I believe this is one of the issues this category of products tries to solve.

Here's a link to a whitepaper describing how a specific commercial global load balancer works:

November 29, 1990 | Unregistered Commenter jab

Hi jab,

Thanks for your response. I found several hardware load-balancing systems, even for global balancing (e.g. Cisco ACE). The problem is that in the worst case the datacenter goes down _completely_ (e.g. an earthquake), so these systems are not available either, while the other DC still is. How could I tell clients with a cached IP to resolve the hostname again?

One idea for this scenario might be to announce the complete network address range of the broken DC from the other DC, but this solution is very complicated and I don't even know how to get it running.

November 29, 1990 | Unregistered Commenter Anonymous

Just another note: Zeus also relies on the DNS record TTL, see page 9. But they don't mention that TTLs are sometimes ignored :(

Zeus can also handle BGP routing control, but as they say in the PDF (page 7), "its expensive and coarse to provide fine-grained load balancing control".

November 29, 1990 | Unregistered Commenter Anonymous

Those hosting sites in EC2 deal with this problem because they have to use DNS to failover when their instances go down. It appears the problem is made worse because web browsers (IE) cache IP addresses for longer than the TTL.

November 29, 1990 | Unregistered Commenter Todd Hoff
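As a rough way to reason about how long a client can keep hitting the dead datacenter, the caching layers mentioned in this thread can be added up. The sketch below is only back-of-the-envelope arithmetic: the function name, the parameters, and the example numbers are hypothetical, and it assumes the resolver hands out the remaining lifetime of its cache entry while the client may pin an answer for a fixed time regardless of TTL.

```python
def worst_case_stale_seconds(record_ttl: int,
                             resolver_min_ttl: int = 0,
                             client_pin: int = 0) -> int:
    """Upper bound on how long a stale IP may still be used after failover."""
    # A resolver that enforces its own minimum keeps the record at least
    # that long, even if the published TTL is shorter.
    resolver_hold = max(record_ttl, resolver_min_ttl)
    # In the worst case the client asks just before the resolver entry
    # expires and then pins the answer for its own fixed period.
    return resolver_hold + client_pin


# Everything honors a 2-minute TTL.
print(worst_case_stale_seconds(120))
# Hypothetical resolver floor of 5 minutes and browser pin of 30 minutes.
print(worst_case_stale_seconds(120, resolver_min_ttl=300, client_pin=1800))
```

With these made-up numbers the second case comes to 2100 seconds, i.e. 35 minutes instead of the intended 2, which is why lowering the TTL alone doesn't solve the failover problem.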
