
Hot New Trend: Linking Clouds Through Cheap IP VPNs Instead of Private Lines 

You might think major Internet companies have a latency, availability, and bandwidth advantage because they can afford expensive dedicated point-to-point private line networks between their data centers. And you would be right. It's a great advantage. Or at least it was a great advantage. Cost is the great equalizer and companies are now scrambling for ways to cut costs. Many of the most recognizable Internet companies are moving to IP VPNs (Virtual Private Networks) as a much cheaper alternative to private lines. This is a strategy you can effectively use too.

This trend has historical precedent in the data center. In the same way leading edge companies moved early to virtualize their data centers, leading edge companies are now virtualizing their networks using IP VPNs to build inexpensive private networks over a shared public network. In kindergarten we learned sharing was polite. It turns out sharing can also save a lot of money, both in the data center and on the network.

The line of reasoning for adopting IP VPNs goes something like this:

  • Major companies are saving 1/4 to 1/2 of their networking costs by moving from private lines to IP VPNs. This does not even include the benefits of lower equipment costs (GigE ports are basically free) and more flexible provisioning (any-to-any connectivity, easy bandwidth dialup).
  • Cheaper comes with a cost. Private lines are reliable. The Internet is inherently unreliable, especially when two endpoints are linked by potentially dozens of routers in between. In particular Internet connections suffer from: 1) dropped packets 2) out of order packets. Statistically this may happen for only 1% of packets, but when it does the user experience plummets. To get a feel for the impact imagine you have a 200ms latency link to Europe and you're trying to do something interactive. Lose a packet and you'll have to wait for a retransmission which will take at least 1 second. So IP VPNs can provide an order of magnitude more bandwidth for less money, but they often have less actual throughput and reliability.
  • Since latency and quality are so important to Internet companies, how can they possibly afford to use IP VPNs? They cheat. They fix the IP connection by using WAN accelerators.
  • WAN accelerators are typically thought to be mostly about caching, but they can also trick TCP into giving a better connection even over unreliable networks. It's like wearing corrective lenses for your network. And that's what you need when dropping dedicated lines for Internet connections.
  • Relatively inexpensive WAN accelerators can turn somewhat unreliable Internet connections into a very reliable cost effective connection option. Your customers won't believe it's not butter.
  • The result: lots of money saved and a quality customer experience.

    We take TCP for granted so to learn it has this unsightly packet loss/delay problem is a bit unsettling. But here's the impact packet loss has on throughput:
  • Latency: 100ms, Loss: 1%, Throughput: 1.2 Mbps
  • Latency: 200ms, Loss: 1%, Throughput: 0.6 Mbps
  • Latency: 100ms, Loss: 0.5%, Throughput: 1.7 Mbps

    These numbers are independent of your WAN link capacity. You could have a 100Mbps link with 1% loss and 100ms latency and you'd still be limited to roughly 1Mbps!
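
The three figures above are consistent with the well-known Mathis approximation for loss-limited TCP throughput, MSS / (RTT × sqrt(loss)). The article does not say which formula produced them, so treat the sketch below as an assumption: it uses a 1460-byte segment size and a constant factor of 1, and the function name is made up for illustration.

```python
import math

def loss_limited_throughput_bps(rtt_s, loss_rate, mss_bytes=1460):
    """Rough per-connection TCP ceiling: MSS / (RTT * sqrt(loss)).

    Assumption: 1460-byte segments and a constant factor of 1.
    Link capacity does not appear in the formula at all.
    """
    return (mss_bytes * 8) / (rtt_s * math.sqrt(loss_rate))

# Reproduce the three latency/loss combinations listed above.
for rtt_ms, loss_pct in [(100, 1.0), (200, 1.0), (100, 0.5)]:
    bps = loss_limited_throughput_bps(rtt_ms / 1000, loss_pct / 100)
    print(f"Latency: {rtt_ms}ms, Loss: {loss_pct}%, Throughput: {bps / 1e6:.1f} Mbps")
```

Because the link speed is not an input, a faster pipe does nothing for a single lossy, high-latency connection under this model.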

    The reason for this bandwidth-robbing state of affairs is that when TCP was designed, packet loss meant network congestion. TCP deals with congestion by sharply cutting back how much data it sends, which drops throughput drastically for a very long time. Over long distance WAN connections packets can be delayed, which looks like packet loss and causes congestion avoidance measures to kick in. Or maybe only a single packet was dropped, and that alone kicks in congestion avoidance.
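
To get a feel for how long "a very long time" can be, here is a minimal sketch under a simplified, Reno-style model: one loss halves the congestion window, and the window then grows by one segment per round trip. Real stacks differ (newer congestion control algorithms recover faster), and the function name and the 1460-byte segment size are assumptions for illustration.

```python
def recovery_time_s(link_bps, rtt_s, mss_bytes=1460):
    """Seconds for a halved congestion window to grow back to a full pipe.

    Simplified model (an assumption, not any specific TCP stack):
    a single loss halves the window, then the window grows by one
    segment per round trip until it fills the link again.
    """
    # Segments in flight needed to keep the link full (bandwidth x delay).
    full_window_segments = (link_bps * rtt_s) / (mss_bytes * 8)
    # Growing from half the window back to full takes half a window of round trips.
    round_trips_needed = full_window_segments / 2
    return round_trips_needed * rtt_s

# 100Mbps link with 100ms of latency
print(f"{recovery_time_s(100e6, 0.100):.0f} seconds to recover from one loss")
```

Under these assumptions a single dropped packet on a 100Mbps, 100ms link costs roughly 43 seconds of reduced throughput, and the next loss usually arrives well before the window has recovered.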

    The trick is convincing TCP that everything is cool so the full connection bandwidth can be used. WAN accelerators have a number of complex features to keep TCP happy. Damon Ennis, VP Product Management and Customer Support for Silver Peak, a WAN accelerator company, talks about why clouds, IP VPNs, and WAN accelerators are a perfect match:
    Moving applications into the cloud offers substantial cost savings for enterprises. Unfortunately those savings come at the cost of application performance. Often performance is so hampered that users’ productivity is severely limited. In extreme cases, users refuse to utilize the cloud-based application altogether and resort to old habits like saving files locally without centralized backup or returning to their old “thick” applications.

    The cloud limits performance because the applications must be accessed over the WAN. WANs are different from LANs in three ways – WAN bandwidth is a fraction of LAN bandwidth, WAN latency is orders of magnitude higher than LAN latency, and packet loss exists on the WAN where none existed on the LAN. Most IT professionals are familiar with the impacts of bandwidth on transfer times – a 100MB file takes approximately 1 second to transfer on a Gbps LAN and approximately 10 seconds to transfer on a 100Mbps LAN. They then extrapolate this thinking to the WAN and assume that it will take 10 seconds to transfer the same file on a 100Mbps WAN. Unfortunately, this isn’t the case. Introduce 100ms of latency and this transfer now takes almost 3 minutes. Introduce just 1 % packet loss and this transfer now takes over 10 minutes.

    There’s a calculator available that will let you figure out the effective throughput of your own WAN if you know its bandwidth, latency, and loss. Once you know your effective throughput simply divide 800Mb (100MB) by your effective throughput to determine how long it would take to transfer the same example file over your WAN.

    Latency and loss don’t just impact file transfer times, they also have a dramatic impact on any applications that need to be accessed in real-time over the WAN. In this context a real-time application is one that requires real-time response to users’ keystrokes – think of any application that is served over a thin-client infrastructure or Virtual Desktop Infrastructure (VDI). Not only is the server 100 ms away but any lost packet will result in delays of up to half a second waiting for the loss to be detected and the retransmission to occur. This is the root cause of the frustrated user banging on the enter key looking for a response.
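
The file transfer arithmetic in the quote above can be roughly reproduced by taking the smallest of three limits: the link speed, one TCP window per round trip, and the loss-limited rate. This is only a sketch, not Silver Peak's actual calculator. It assumes a standard 64KB window without window scaling and 1460-byte segments, and the function names are made up for illustration.

```python
import math

def effective_throughput_bps(link_bps, rtt_s=0.0, loss_rate=0.0,
                             window_bytes=65535, mss_bytes=1460):
    """Smallest of: link speed, window/RTT, and MSS/(RTT*sqrt(loss)).

    Assumptions: 64KB window (no window scaling), 1460-byte segments.
    """
    limits = [link_bps]
    if rtt_s > 0:
        # At most one full window can be in flight per round trip.
        limits.append(window_bytes * 8 / rtt_s)
        if loss_rate > 0:
            # Loss-limited ceiling (Mathis approximation, constant of 1).
            limits.append(mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate)))
    return min(limits)

def transfer_time_s(file_bytes, link_bps, rtt_s=0.0, loss_rate=0.0):
    """Seconds to move file_bytes at the effective throughput."""
    return file_bytes * 8 / effective_throughput_bps(link_bps, rtt_s, loss_rate)

FILE_BYTES = 100 * 10**6  # the 100MB (800Mb) example file

scenarios = [
    ("1Gbps LAN", 1e9, 0.0, 0.0),
    ("100Mbps LAN", 100e6, 0.0, 0.0),
    ("100Mbps WAN, 100ms latency", 100e6, 0.100, 0.0),
    ("100Mbps WAN, 100ms latency, 1% loss", 100e6, 0.100, 0.01),
]
for label, link_bps, rtt_s, loss_rate in scenarios:
    seconds = transfer_time_s(FILE_BYTES, link_bps, rtt_s, loss_rate)
    print(f"{label}: {seconds:.1f} s ({seconds / 60:.1f} min)")
```

With these assumptions the WAN cases come out at about 2.5 minutes and about 11.4 minutes, in line with the "almost 3 minutes" and "over 10 minutes" quoted above. The no-loss result depends entirely on the 64KB window assumption, so hosts with window scaling enabled would do better there.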

    This all seems like a lot of effort, doesn't it? Why not just dump TCP and move to a better protocol? Sounds good, but everything works on TCP, so to change now would be monumental. And as strange as it seems, TCP is doing its job. It's a protocol, so there's a separation of what's above it from what's below it, which allows innovation at the TCP level without breaking anything. And that's what layering is all about.

    The upshot is with a little planning you can take advantage of much cheaper IP VPN costs, improve latency, and maximize bandwidth usage. Just like the big guys.

    Related Articles

  • Cloud Computing Requires Infrastructure 2.0 by Gregory Ness
  • Myth of Bandwidth and Application Performance by Ameet Dhillon
  • How Does WAN Optimization Work? by Paul Rubens
  • SilverPeak Technology Overview

    Reader Comments (5)

    According to Silver Peak's calculator you'd need one of their accelerators to get much beyond 100Mbps on a LAN! I think the flaw here is that they assume your endpoints do not support TCP window scaling. That's great for pushing expensive products but does not reflect the real world.

    November 29, 1990 | Unregistered Commenter Anonymous

    The formula used for the calculator is well-documented in the literature. If the webpage would let you plug in less than 5 ms latency you'd get full LAN throughput. That said, the poster is correct that the formula assumes a standard TCP window. In the real world, admins can't configure and maintain thousands of hosts and servers with widely varying OSes, revisions, and patches for window scaling. Also, full-featured TCP acceleration does more than window scaling - it also includes features like selective acknowledgment, accurate round-trip measurement, and high-speed TCP that increase throughput in the real world. Finally, window scaling alone won't help you when there's loss - you need Forward Error Correction. And, of course, none of this discussion includes any of the other features that a full-featured WAN optimization product brings (deduplication, packet order correction, QoS, coalescing, forward error correction, support for protocols other than TCP).

    November 29, 1990 | Unregistered Commenter Damon Ennis

    In case anyone is looking for a real product that can do this today, CohesiveFT VPN-Cubed allows you to link multiple clouds and datacenters through cheap VPNs and as of last month supports IPsec interop with Cisco and other vendors.


    November 29, 1990 | Unregistered Commenter Dmitriy

    This article is a bit confusing. Is this about being able to link Clouds together using IP VPN i.e. over the internet with WAN/LAN like characteristics? and then saying because this is possible it is reasonable to throw away WAN links between private data centres and replace with IP VPN? What type of WAN solutions are you talking about?

    It isn't clear how it has been concluded that IP VPN over the internet is cheaper, just because some big companies are doing it?? Are we talking metro, national, or international? What kind of bandwidth? At least at national level I do not think this IP VPN is a sensible option to replace a WAN link. At least in the UK MPLS/VPLS type WAN solutions are actually a lot cheaper than "top tier" internet bandwidth. Install costs are about the same per end, extra nodes can be brought in the same way as IP VPN, and bandwidth is more than 50% cheaper - IP VPN needs a bunch of not so cheap stuff on top of this. Unless you are talking about a complete outsourced IP VPN solution then the modern MPLS/VPLS based WAN is a much simpler setup (low admin overhead) no VPN devices, no WAN optimiser/accelerator etc.

    So if we are talking about cloud then by today's definitions a public cloud is only connected to the internet so IP VPN is the only way. Depending on what type of WAN solution you refer to this may also hold true, older point-to-point solutions might be more costly than modern Ethernet based solutions. Also depends on what countries you are talking about as cost is based on maturity and the current stage of the comms market.

    WAN optimization kit has a place, but a lot of the time it is used by network people as a bolt-on fix for problems in software. And this is not always the smartest approach.

    November 29, 1990 | Unregistered Commenter JA

    The main point is that as enterprises move away from (expensive) private line WAN links to cheaper bandwidth (MPLS, VPLS, or Internet) they need to contend with the impact of loss on their applications. Generally speaking private lines have no loss and all "shared" WAN technologies (MPLS, VPLS, or Internet VPN) have some non-zero loss. Even a very small amount of loss (as low as 0.1%) has a significant impact on per-flow throughput and even more impact on "real-time" applications such as Citrix, Voice, and Video for which retransmission is an ineffective error recovery mechanism. The impact of loss compounds as latency increases.

    When enterprises rely on "The Cloud" to host applications they are almost always accessing the application over the Internet so loss and latency can be significant and very detrimental to application performance.

    So, as enterprises move to more cost effective WAN technologies, loss and latency combine to make applications unusable. WAN optimization is a solution to these loss and latency problems (and also supplies a bandwidth multiplier).

    November 29, 1990 | Unregistered Commenter damonennis
