Averages, web performance data, and how your analytics product is lying to you  

This guest post is written by Josh Fraser, co-founder and CEO of Torbit. Torbit creates tools for measuring, analyzing and optimizing web performance.  

Did you know that 5% of the pageviews on Walmart.com take over 20 seconds to load? Walmart discovered this recently after adding real user measurement (RUM) to analyze their web performance for every single visitor to their site. Walmart used JavaScript to measure their median load time as well as key metrics like their 95th percentile. While 20 seconds is a long time to wait for a website to load, the Walmart story is actually not that uncommon. Remember, this is the worst 5% of their pageviews, not the typical experience.

Walmart's median load time was reported at around 4 seconds, meaning half of their visitors loaded Walmart.com faster than 4 seconds and the other half took longer than 4 seconds to load. Using this knowledge, Walmart was prepared to act. By reducing page load times by even one second, Walmart found that they would increase conversions by up to 2%.

The Walmart case study highlights how important it is to use RUM and look beyond averages if you want an accurate depiction of what's happening on your site. Unlike synthetic tests, which load your website from random locations around the world, RUM allows you to collect real data from your actual visitors. If Walmart hadn't added RUM and started tracking their 95th percentile, they might never have known about the performance issues that were costing them some of their customers. After all, nearly every performance analytics product on the market just gives you an average loading time. If you only look at Walmart's average loading time of 7 seconds, it's not that bad, right? But as you just read, averages don't tell the whole story.

There are three common ways to measure the central tendency of a data set: the average (or mean), the median, and the mode. In this post we're only going to focus on the first two. We're also going to look at percentiles. Our real user measurement tool reports all of these for you.

It may have been some time since you dealt with these terms so here's a little refresher:

  • Average (mean): The sum of every data value in your set, divided by the total number of data points in that set. Skewed data or outliers can pull the average away from the center, which could lead you to draw the wrong conclusions.
  • Median: If you lined up each value in a data set in ascending order, the median is the single value in the exact middle. In page speed analytics, using the median gives you a more accurate representation of page load times for your visitors since it's not influenced by skewed data or outliers. The median represents a load time where 50% of your visitors load the page faster than the median value and 50% load the page slower than that value.
  • Percentiles: Percentiles divide your sorted data into 100 equal-sized groups. When someone says, "You're in the 90th percentile," they mean your value is higher than 90 percent of the values in the data set. In real user measurement, the 90th percentile is a time value: 90 percent of your pageviews load at that value or faster. In general, a percentile gives you a load time that a given percentage of your visitors can be expected to beat.
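
To make these definitions concrete, here is a minimal Python sketch that computes all three for a small, made-up list of load times. The numbers are hypothetical, and the `percentile` helper uses the simple nearest-rank method, which is only one of several common ways to compute percentiles and is not necessarily what any particular analytics product does.

```python
import math
import statistics


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based position in the sorted list
    return ordered[rank - 1]


# Hypothetical page load times in seconds (not real measurements).
load_times = [1.2, 1.5, 1.8, 2.0, 2.1, 2.4, 2.9, 3.3, 4.0, 20.0]

mean_time = statistics.mean(load_times)      # pulled upward by the single 20 s outlier
median_time = statistics.median(load_times)  # the middle of the sorted data
p90 = percentile(load_times, 90)             # 90% of pageviews are at or below this
p95 = percentile(load_times, 95)             # reaches into the slow tail

print(f"mean={mean_time:.2f}s median={median_time:.2f}s p90={p90:.1f}s p95={p95:.1f}s")
```

In this made-up sample, a single slow pageview is enough to push the mean well above the median, while the 95th percentile makes that slow pageview visible instead of blending it away.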


Example histogram showing the log-normal distribution of loading times

Look at this example histogram showing the loading times for one of our customers. If you've studied probability theory, you may recognize this as a log-normal distribution. A log-normal distribution arises when a value is the multiplicative product of many independent positive random variables, so its logarithm is approximately normally distributed. When dealing with performance data, a histogram is one of your most helpful visualizations.
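
As a rough illustration of why multiplying independent factors produces this shape, the following Python sketch simulates load times as a product of random slowdown factors and prints a text histogram. Everything here is an assumption made for illustration: the number of factors, their range, the seed, and the bin width are arbitrary and are not taken from the customer data described above.

```python
import math
import random
import statistics


def simulate_load_times(n, factors=12, base=1.0, seed=42):
    """Each simulated load time is a base time multiplied by several independent random factors."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        base * math.prod(rng.uniform(0.6, 1.6) for _ in range(factors))
        for _ in range(n)
    ]


def bucket_counts(values, bin_width=1.0, max_bins=15):
    """Count values per bin; the last bin collects everything in the long tail."""
    counts = [0] * max_bins
    for value in values:
        index = min(int(value // bin_width), max_bins - 1)
        counts[index] += 1
    return counts


samples = simulate_load_times(10_000)
counts = bucket_counts(samples)

# Text histogram: one row per one-second bin, bars scaled to the tallest bin.
for i, count in enumerate(counts):
    bar = "#" * (count * 50 // max(counts))
    print(f"{i:2d}s {bar} {count}")

# The raw values are right-skewed, but their logarithms are roughly symmetric.
log_samples = [math.log(s) for s in samples]
print("mean:", statistics.mean(samples), "median:", statistics.median(samples))
print("mean of logs:", statistics.mean(log_samples),
      "median of logs:", statistics.median(log_samples))
```

The printed histogram should show a tall peak at short load times and a long tail to the right, and the mean and median of the logarithms should land close to each other even though the raw mean sits well above the raw median.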

In this example, other products that only report the average load time would show that their visitors load the site in 5.76 seconds. While the average page load is 5.76 seconds, the median load time is 3.52 seconds. Over half of visitors load the site faster than 5.76 seconds, but you'd never know that just looking at averages. Additionally, the 90th percentile here is over 11 seconds! Most people are experiencing load times faster than that, but of course, that 10% still matters.
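
The relationship between the average and the median can be checked directly by counting how many pageviews beat each value. The Python sketch below does this for a small hypothetical sample; the load times are invented for illustration, and `share_faster_than` is a helper name introduced here, not part of any product.

```python
import statistics


def share_faster_than(values, threshold):
    """Fraction of values strictly below the threshold."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(1 for v in values if v < threshold) / len(values)


# Hypothetical page load times in seconds, with a slow tail.
load_times = [0.9, 1.4, 2.2, 2.8, 3.1, 3.6, 4.4, 5.0, 9.5, 24.0]

mean_time = statistics.mean(load_times)
median_time = statistics.median(load_times)

share_below_mean = share_faster_than(load_times, mean_time)
share_below_median = share_faster_than(load_times, median_time)

print(f"mean={mean_time:.2f}s, share of pageviews faster than the mean: {share_below_mean:.0%}")
print(f"median={median_time:.2f}s, share of pageviews faster than the median: {share_below_median:.0%}")
```

With skewed data like this, far more than half of the pageviews come in faster than the mean, which is why an average on its own says little about the typical visitor.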

For people who care about performance, it's important to use a RUM product that gives you a complete view into what's going on. You should be able to see a histogram of the loading times for every visitor to your site. You should be able to see your median load time, your 99th percentile and lots of other key metrics that are far more actionable than just looking at an average.

For any business making money online, you know that every visitor matters. For most sites, it's not acceptable for 10% of your visitors to have a terrible experience. Those 10% were potential customers that you lost, perhaps for good, simply because your performance wasn't as great as it should have been. But how do you quantify that?

It all begins with real user measurement.

If you want to accurately measure the speed on your site, it's important to include RUM in your tool belt. Neither synthetic tests nor averages tell the full story. Without RUM, you're missing out on important customer experience data that really matters for your business.

Reader Comments (3)

How can you be sure that this problem is in Walmart's system? How much of this 5% is a crappy Wi-Fi connection or other glitches that Walmart can't control?

May 24, 2012 | Unregistered Commenter Angus

Great article, well explained. Good insights on how to analyze performance metrics to gauge application performance.

I believe the following sentence has a typo. Kindly ignore if my understanding is wrong.
"While the average page load is 5.76 seconds, the median load time is 3.52 seconds. Over half of visitors load the site faster than 5.76 seconds, but you'd never know that just looking at averages"

It should be read as
"While the average page load is 5.76 seconds, the median load time is 3.52 seconds. Over half of visitors load the site faster than 3.52 seconds, but you'd never know that just looking at averages"

May 24, 2012 | Unregistered Commenter Barath K

Nice article, thank you.

It reminds me of the great talk by John Rauser called Look at Your Data.

@Barath K:
While you may be right that there is a mistake, do consider that the original statement quite makes sense.

The author's sentence (and the idea of the article in general) probably means that looking at just the average of 5.76s may trick us into thinking that this is the actual load time of all users but it isn't. Also it could trick us into making the wrong conclusions. That's just an average with all the cons an average has, as explained in this article and the talk by John Rauser :)

The median here tells us that ~50% of the users are subject to slower load times than 3.52s, while the other ~50% of the users experience load times faster than that. In combination with the known average, this would also probably mean that "Over half of visitors ..." (e.g. could be 70%) load the page in under 5.76s. All of which is something "you'd never know just looking at averages". The value of the median in addition to knowing the average gives us a much better understanding of the whole picture.

Or I could be wrong in which case please ignore my comment :)


May 26, 2012 | Unregistered Commenter Radko Dinev
