Performance in the Cloud: Business Jitter is Bad


One of the benefits of web applications is that they are generally transported via TCP, a connection-oriented protocol designed to assure delivery. TCP has a variety of native mechanisms through which delivery issues can be addressed, from window sizes to selective acks to idle time specification to ramp-up parameters. All these technical knobs and buttons give operators and administrators a way to tweak the protocol, often at run time, to ensure the exchange of requests and responses upon which web applications rely. This is unlike UDP, which is more of a “fire and forget” protocol in which the sender doesn’t really care whether you receive the data or not.

Now, voice and streaming video and audio over the web have always leveraged UDP and thus have always been highly sensitive to jitter. Jitter is, without getting into layer one (physical) jargon, an undesirable variation in the otherwise consistent delivery of packets. It causes the delay, and sometimes the outright loss, of packets, which users experience as pauses, skips, or jumps in multimedia content.
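To put a rough number on that inconsistency, one simple approach is to look at how much the gaps between packet arrivals vary. The sketch below is only an illustration in Python, not a standard metric: the function name `interarrival_jitter` and the sample timestamps are made up for the example, and real media stacks typically use a smoothed estimator instead.

```python
from statistics import mean

def interarrival_jitter(arrival_times_ms):
    """Mean absolute deviation of inter-arrival gaps from their average gap.

    arrival_times_ms: packet arrival timestamps in milliseconds, in order.
    (Hypothetical helper for illustration only.)
    """
    # Gap between each packet and the one before it.
    gaps = [b - a for a, b in zip(arrival_times_ms, arrival_times_ms[1:])]
    if not gaps:
        raise ValueError("need at least two arrival times")
    avg_gap = mean(gaps)
    # How far, on average, each gap strays from the average gap.
    return mean(abs(g - avg_gap) for g in gaps)

# Made-up samples: both streams deliver 5 packets in 80 ms.
steady = [0, 20, 40, 60, 80]   # perfectly even delivery
bursty = [0, 20, 65, 70, 80]   # same average gap, uneven delivery

print(interarrival_jitter(steady))
print(interarrival_jitter(bursty))
```

Both streams have the same average gap between packets; only the second would be heard as skips and pauses, which is why the spread matters more than the average.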

While the same root causes of delay (network congestion, routing changes, timeout intervals) also affect TCP, they generally only delay the communication and, other than an uncomfortable wait for the user, do not negatively impact the content itself. The content is eventually delivered because TCP guarantees delivery; UDP does not.

However, this does not mean that there are no negative impacts (other than trying the patience of users) from the performance issues that may plague web applications, particularly those that increasingly live out there in the nebulous “cloud”. Delays are effectively business jitter and have a real impact on the ability of the business to perform its critical functions, and that includes generating revenue.


David  Linthicum summed up the issue with performance of cloud-based  applications well and actually used the terminology “jitter” to describe  the unpredictable pattern of delay:

Are  cloud services slow? Or fast? Both, it turns out -- and that reality  could cause unexpected problems if you rely on public clouds for part of  your IT services and infrastructure.
When  I log performance on cloud-based processes -- some that are I/O  intensive, some that are not -- I get results that vary randomly  throughout the day. In fact, they appear to have the pattern of a very  jittery process. Clearly, the program or system is struggling to obtain  virtual resources that, in turn, struggle to obtain physical resources.  Also, I suspect this "jitter" is not at all random, but based on the  number of other processes or users sharing the same resources at that  time.
-- David Linthicum, “Face the facts: Cloud performance isn't always stable”

But what the multitude of articles on the performance of cloud services over the past year or so has largely ignored is the very real and often measurable impact on business processes. The jitter that occurs at the protocol and application layers trickles up to become jitter in the business process, a process that may be critical to servicing customers (and thus to satisfaction and brand) as well as to the bottom line. Unhappy customers forced to wait for “slow computers”, as the problem is so often described by the less technically adept customer service representatives employed by many organizations, may take to the social media airwaves to express displeasure, cancel an order, or simply refuse to do business with the organization in the future because of delays caused by unpredictable cloud performance.

Business jitter can also manifest as decreased business productivity measures, which it turns out can be measured mathematically if you put your mind to it.
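As a rough sketch of what such a measurement might look like, the Python below summarizes a set of logged response times and estimates how many transactions per hour a single representative loses compared with the best response time observed. Everything here is an assumption made for illustration: the function name `business_jitter`, the `handling_time_s` parameter (time spent per transaction excluding waiting on the application), and the sample values are hypothetical, not measurements.

```python
from statistics import mean, pstdev

def business_jitter(response_times_s, handling_time_s):
    """Summarize response-time variability and its effect on throughput.

    response_times_s: logged application response times, in seconds.
    handling_time_s:  assumed time per transaction spent on everything
                      except waiting for the application.
    """
    avg = mean(response_times_s)
    spread = pstdev(response_times_s)  # population standard deviation
    # Transactions per hour if every response were as fast as the best one.
    best_case = 3600 / (handling_time_s + min(response_times_s))
    # Transactions per hour at the average observed response time.
    actual = 3600 / (handling_time_s + avg)
    return {
        "mean_s": avg,
        "stdev_s": spread,
        "lost_tx_per_hour": best_case - actual,
    }

# Hypothetical log: mostly 2-second responses with one 10-second stall.
summary = business_jitter([2, 2, 2, 10], handling_time_s=118)
print(summary)
```

Multiplying the lost transactions by the number of representatives and working hours gives one possible productivity figure; a more careful model would work from the full distribution rather than the mean.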

Understanding the variability of cloud performance is important for two reasons:

  1. You need to understand the impact on the business and quantify it before embarking on any cloud initiative so it can be factored into the overall cost-benefit analysis. It may be that the cost savings from public cloud are much greater than the potential loss of revenue and/or productivity, and thus the benefits of a cloud-based solution outweigh the risks.
  2. Understanding the variability, and where it comes from, will help guide you in choosing not only the right provider but also the right solutions that may be able to normalize or mitigate the variability. If the primary source of business jitter is your WAN, for example, then it may be that choosing a provider that supports your ability to deploy WAN optimization solutions would be an appropriate strategy. Similarly, if the variability in performance stems from capacity issues, then choosing a provider that allows greater latitude in load balancing algorithms or the deployment of a virtual (soft) ADC would likely be the best strategy.
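The cost-benefit comparison in the first point comes down to simple arithmetic once the impact has been quantified. The following Python sketch shows the shape of that calculation; the function name `net_cloud_benefit`, its parameters, and all of the figures are hypothetical placeholders, and a real analysis would include many more cost factors.

```python
def net_cloud_benefit(annual_savings, lost_hours_per_year,
                      hourly_cost, lost_revenue_per_year):
    """Annual cloud savings minus the estimated annual cost of business jitter.

    All inputs are estimates supplied by the business (hypothetical here).
    A positive result suggests the savings outweigh the jitter-related losses.
    """
    # Cost of jitter = lost productivity (hours x cost) + lost revenue.
    jitter_cost = lost_hours_per_year * hourly_cost + lost_revenue_per_year
    return annual_savings - jitter_cost

# Made-up figures purely for illustration.
net = net_cloud_benefit(
    annual_savings=120_000,        # estimated savings from public cloud
    lost_hours_per_year=400,       # staff hours lost waiting on slow responses
    hourly_cost=50,                # loaded cost per staff hour
    lost_revenue_per_year=30_000,  # abandoned or cancelled orders
)
print(net)
```

If the result goes negative, or only barely stays positive, that is a signal to look at the mitigation options in the second point before committing.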

It seems clear from testing and empirical (as well as anecdotal) evidence that cloud performance is highly variable and, as David puts it, unstable. This should not necessarily be seen as a deterrent to adopting cloud services, unless your business is so sensitive to latency that even milliseconds can be financially damaging. Rather, it should be a reality that factors into your decision-making process with respect to your choice of provider and the architecture of the solution you’ll be deploying (or subscribing to, in the case of SaaS) in the cloud.

Knowing is half the battle to leveraging cloud successfully. The other half is strategy and architecture.