performance measurement

A non-geeky guide to understanding performance measurement terms

In our industry, there’s a lot of language around how we time website speed. We tend to assume that outsiders understand our language, but something I read recently indicates that the average person doesn’t. We need to fix that.

A couple of weeks ago, I came across this article written by Luxury Daily writer Rachel Lamb: Luxury marketers dramatically drop site loading times. Given our own research into web performance and luxury markets here at Strangeloop, my curiosity was piqued.

The article made this statement, based on a recent report by SmartBear:

The average luxury site’s load time went from 2.6281 seconds in the third quarter to 1.321 seconds in the fourth quarter of 2011.

Looking at the results made me laugh and cry:

Home page             Load/response time in seconds
                      (as cited in article)
Rolls-Royce           0.169
Porsche (US)          0.256
Jaguar (US)           0.260
Mercedes-Benz (US)    3.405
Ferrari (US)          4.585
Infiniti (US)         4.154
Prada (US)            0.170
Cartier               0.244
Calvin Klein          3.742
Burberry (US)         3.548
Hugo Boss (US)        0.658

Given that the ideal load time is 2 seconds or less, this is a good-looking set of numbers — too good-looking. I did a little research of my own via WebPagetest.* The three new columns are mine.

All times in seconds.

Home page             Load/response time      First byte   Start render   Load time
                      (as cited in article)
Rolls-Royce           0.169                   0.788        2.712          13.339
Porsche (US)          0.256                   0.187        0.766          4.760
Jaguar (US)           0.260                   0.538        1.098          10.722
Mercedes-Benz (US)    3.405                   0.380        2.274          9.507
Ferrari (US)          4.585                   0.483        1.409          2.831
Infiniti (US)         4.154                   0.909        2.849          15.528
Prada (US)            0.170                   0.723        10.800         12.142
Cartier               0.244                   0.175        0.740          1.340
Calvin Klein          3.742                   0.316        0.540          0.786
Burberry (US)         3.548                   0.408        2.020          6.364
Hugo Boss (US)        0.658                   0.353        3.593          6.425

What do these numbers mean?

If you’re a relative newcomer to the performance scene and the tables above look like numerical gibberish, it’s not your fault. I’ll get into the terminology later in this post. For now, suffice it to say that there’s a lot of variance in these numbers.
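To put a rough number on that variance, here is a minimal Python sketch that counts how many of the eleven home pages come in at 2 seconds or less under each column. The figures are copied from the tables above; the names (`TIMINGS`, `count_within`) are purely illustrative, and the 2-second cutoff is simply the ideal load time mentioned earlier.

```python
# Timings in seconds, copied from the tables above:
# (cited load/response time, first byte, start render, load time)
TIMINGS = {
    "Rolls-Royce": (0.169, 0.788, 2.712, 13.339),
    "Porsche (US)": (0.256, 0.187, 0.766, 4.760),
    "Jaguar (US)": (0.260, 0.538, 1.098, 10.722),
    "Mercedes-Benz (US)": (3.405, 0.380, 2.274, 9.507),
    "Ferrari (US)": (4.585, 0.483, 1.409, 2.831),
    "Infiniti (US)": (4.154, 0.909, 2.849, 15.528),
    "Prada (US)": (0.170, 0.723, 10.800, 12.142),
    "Cartier": (0.244, 0.175, 0.740, 1.340),
    "Calvin Klein": (3.742, 0.316, 0.540, 0.786),
    "Burberry (US)": (3.548, 0.408, 2.020, 6.364),
    "Hugo Boss (US)": (0.658, 0.353, 3.593, 6.425),
}
COLUMNS = ("cited", "first_byte", "start_render", "load_time")
THRESHOLD_S = 2.0  # the "ideal load time" cutoff, in seconds


def count_within(threshold_s=THRESHOLD_S):
    """Return, per column, how many sites are at or under the threshold."""
    counts = {}
    for index, column in enumerate(COLUMNS):
        counts[column] = sum(
            1 for row in TIMINGS.values() if row[index] <= threshold_s
        )
    return counts


if __name__ == "__main__":
    for column, count in count_within().items():
        print(f"{column}: {count} of {len(TIMINGS)} sites at or under {THRESHOLD_S} s")
```

By the cited figure, six of the eleven sites look fast. By load time, only two do.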

So, you have response time, time to first byte, start render time, and load time. Which set of numbers do you rely on to answer the #1 question site owners ask:

How fast does my site load for real users?

The short answer is: None of them, totally.

The long answer is: It’s complicated. Keep reading.

If you can’t trust numbers, what can you trust?

Numbers in a spreadsheet are a good way to spot larger patterns and trends, but if you want a ground-level look at your site’s performance, capturing videos and filmstrip views of your pages as they load is one of the best ways to go.

To illustrate, let’s take a closer look at two of the top-performing sites, according to this article: Prada and Rolls-Royce.

Remember that, in the luxury website performance article, Prada was lauded as having one of the fastest sites, with a response time of 0.17 seconds? While the response time may have been quick, there’s a serious problem at the network level. If you view Prada’s page load as a filmstrip (a nifty WebPagetest feature that I don’t think gets talked about enough), you see that, from a user’s perspective, nothing happens on the page until around 10.5 seconds, which roughly correlates to the start render time of 10.8 seconds.

You can also output the filmstrip to a video:

If you view the filmstrip for Rolls-Royce, you see that nothing starts to happen until around 3 seconds, again correlating roughly to the start render time of 2.712 seconds. This might sound acceptable on paper, but note that the feature banner doesn’t load till after the 11-second mark. An eyetracking study by usability expert Jakob Nielsen found that delaying banner load by 8 seconds resulted in the banner being virtually ignored when it finally showed up.

Now let’s watch it as a video:

So how do we make sense of the fact that, according to the Luxury Daily article, the Rolls-Royce website had a response time of 0.169 seconds, while my tests showed the page didn’t start to show up in the browser until almost 3 seconds, and didn’t fully load until more than 13 seconds had passed? We need to define our terms.

Four key performance measurement terms explained (so that normal people can understand them)

First, I want to be straight about the fact that I don’t think SmartBear was trying to mislead anyone with their numbers. Without knowing how SmartBear defined “response time” in their tests, it’s impossible to comment on their results. Because of this vagueness, I think Ms. Lamb has made two understandable mistakes — mistakes I encounter frequently when I talk about performance outside the geek zone:

  • Using “response time” and “load time” interchangeably.
  • Not realizing that “response time” can mean any number of completely different things.

There isn’t a lot of effort to educate the lay public — such as journalists, and even customers — about what these terms mean. To address this problem, here’s a simple guide to understanding fundamental website performance measurement terms, and when and why you should care about each.

Response time

What it means: Response time is incredibly tricky, and it causes a lot of the confusion I encounter. It can refer to any number of things, depending on whom you ask: server-side response time, end-user response time, HTML response time, time to last byte with no bandwidth/latency, and on and on. Long story short: There’s no single definition.

Caveats: If someone starts talking to you about response time, ask them to clarify which response time they mean. Be wary of anyone who tries to sell you on the idea that there’s only one definition. If user experience matters to you, ask how whatever type of response time you’re looking at relates to what the end user actually sees.

When it’s useful: Different types of response time measurements tell you different things, from the health of your back end to when content starts to populate the browser. As I’ve already said — and it bears repeating — you need to know what you’re measuring and why.

Time to first byte

What it means: Time to first byte is measured from the time the request is made to the host server to the time the first byte of the response is received by the browser.

Caveats: Time to first byte doesn’t really mean anything when it comes to understanding the user experience, because the user still isn’t seeing anything in the browser.

When it’s useful: For detecting back-end problems. If your website’s time to first byte is more than 100 milliseconds or so, it means you have back-end issues that need to be examined. (Web performance consultant Andrew King has written a good post about this, as has Google performance expert Pat Meenan.)
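If you want to see this number for yourself, here is a rough Python sketch using only the standard library's `http.client`. It is an approximation rather than a browser measurement: the clock stops when the status line and headers have been read, which is slightly after the very first byte arrives, and the timing includes opening the connection. The function name and the host in the usage comment are placeholders.

```python
import http.client
import time


def measure_first_byte(host, port=80, path="/", timeout=10.0):
    """Return (seconds until the response starts, seconds until the last byte)
    for a single plain-HTTP GET request."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        start = time.perf_counter()
        conn.request("GET", path)      # opens the connection, then sends the request
        response = conn.getresponse()  # returns once the status line and headers are read
        first_byte_s = time.perf_counter() - start
        response.read()                # drain the rest of the response body
        last_byte_s = time.perf_counter() - start
    finally:
        conn.close()
    return first_byte_s, last_byte_s


# Example usage (placeholder host; this makes a real network request):
# first_byte_s, last_byte_s = measure_first_byte("example.com")
# print(f"first byte: {first_byte_s:.3f} s, last byte: {last_byte_s:.3f} s")
```

Neither number says anything about when the user sees content, which is exactly the caveat above.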

Start render

What it means: As its name suggests, “start render” indicates when content begins to display in the user’s browser. This term seems to have evolved as an alternative to “end-user response time”, but it’s not yet widely used outside of hardcore performance circles.

Caveats: Doesn’t indicate whether the first content to populate the browser is useful or important, or simply ads and widgets.

When it’s useful: When measuring large batches of pages, or performance of the same page over time, it’s good to keep an eye on this number. Ideally, visitors should start seeing usable content within 2 seconds. If your start render times are higher than this, you need to take a closer look.

Load time

What it means: The time it takes for all page resources to render in the browser — from those you can see, such as text and images, to those you can’t, such as third-party analytics scripts. (Geek version: “Load time” is also known as “document complete time” or “onLoad time”. It’s measured when the browser fires something called an “onLoad event” after all the page resources have fully loaded. No matter what you call it, it’s used as a primary measuring stick for site performance.)

Caveats: Needs to be taken with a grain of salt, because it isn’t an indicator of when a site begins to be interactive. A site with a load time of 10 seconds can be almost fully interactive in the first 5 seconds. That’s because load time can be inflated by third-party scripts, such as analytics, which users can’t even see.

When it’s useful: Load time is handy when measuring and analyzing large batches of websites, because it can give you a sense of larger performance trends.

Three things to remember:

  1. There’s no single “right” way to measure performance. Each measurement tells you something meaningful about how your site performs.
  2. You need to understand the different performance measurement terms so that you can interpret your own data. If you don’t, sad to say some people will take advantage of your ignorance to mislead you for their own benefit. (For example, it’s a little-known fact that some performance vendors have convinced site owners to tie bonuses for key employees to backbone test results, which do not measure real-world performance.)
  3. As a matter of course, you always need to gather large batches of data and rely on median numbers. But you also need to periodically get under the hood and take a real-world look at how your pages behave for real users.
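As a small illustration of why the median is the safer summary, this Python sketch compares the mean and the median of the eleven start render times from my WebPagetest runs. The numbers come from the table earlier in this post; the names are illustrative only.

```python
from statistics import mean, median

# Start render times in seconds, copied from the WebPagetest table in this post.
START_RENDER_S = [
    2.712, 0.766, 1.098, 2.274, 1.409, 2.849,
    10.800, 0.740, 0.540, 2.020, 3.593,
]


def summarize(samples):
    """Return the mean and median of the samples, rounded to milliseconds."""
    return {
        "mean": round(mean(samples), 3),
        "median": round(median(samples), 3),
    }


if __name__ == "__main__":
    print(summarize(START_RENDER_S))
```

A single slow outlier (Prada at 10.8 seconds) pulls the mean up to roughly 2.6 seconds, while the median stays at about 2.0 seconds.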

*WebPagetest is a third-party tool that simulates how fast a site loads for real-world users using a variety of browsers. In this set of tests, I looked at how fast each site would load for a person using Internet Explorer 8 over a DSL connection via the WebPagetest server in Dulles, VA.


Advanced Mobile Optimization: How does it work? How do we measure success? [slides]

It’s been a busy couple of weeks, but I finally got around to posting the slides from my talk about advanced mobile optimization at the San Francisco & Silicon Valley Web Performance Meetup.

I always enjoy coming to these Meetups, and this time was no exception. Thanks again to Aaron Kulick for inviting me, to LinkedIn for hosting, and to the extremely keen and knowledgeable crowd who turned out. :)


Your 10 favorite posts of 2011

A couple of days ago, I said that it wouldn’t be December without a set of predictions. But it really wouldn’t be December without a roundup of the most-read posts on this site.

1. Early findings: 97% of mobile end-user response time happens at the front end

I revisited Steve Souders’s four-year-old stat that says that 80% of end-user response time occurs at the front end, and made a surprising discovery: After analyzing beacon data from 5 million Strangeloop customer transactions, I found that the front end is where a whopping 97% of mobile response time happens.

2. How to perform a 5-minute page speed/revenue analysis of your e-commerce site

I converted a performance non-believer, first by showing him that his site was 30% faster in IE8 than in IE7, and then by pointing out that the value per visitor on his site was 29% higher for IE8 than it was for IE7. Using two simple tools you probably already have at hand, you can quickly calculate how a faster user experience correlates to greater order value on your own website. (We later used this post as the basis for a short webinar, which you can watch here.)

3. The 12 most-asked questions about how Google factors page speed into its search rankings

It’s a well-known fact that site speed is a critical ranking factor for organic search. One of the most-asked questions I receive is: How exactly does Google do this? Over the last year and a bit, I’ve done quite a bit of digging to get the answers. I thought it would be useful to start an FAQ-style repository for the answers.

4. Automating complexity: The future of website performance optimization

Applying performance best practices in a general sense will take care of 80% of front-end web performance problems, but the last crucial 20% can only be achieved through painful real-world testing and iterative problem solving. We need to find a way to do this quickly and cost-effectively. Back in January, this was my vision.

5. Google’s new Page Speed service: A handy resource for smaller site owners

When Google announced their Page Speed service in July, the most frequent question fielded was, “Is the Page Speed service a threat?” In short, no. If anything, it offers yet more validation that site speed is a crucial business issue.

6. Fourth-party calls: What you don’t know can hurt your site… and your visitors

There’s a growing awareness of the fact that third-party content can cause a major hit to your website’s performance. Good. Great. Now we need to tackle what I’ve dubbed “fourth-party calls”. Not only can these insidious server calls leach performance, they also have massive security implications.

7. Slow websites make people angry

Aberdeen Group has reported that “A one-second delay in page load time equals a 16% decrease in customer satisfaction.” But what does that customer dissatisfaction look like in the real world? I searched Twitter to find out. It wasn’t pretty.

8. Front-end optimization: It isn’t over till it’s over. And it’s never over.

A concise example illustrating three important things about front-end optimization (FEO): the current performance rules are not complete; these performance rules will never be static; and the front-end optimization market is evolving faster than the current performance tools can measure.

9. This is your brain on a slow website: Lab experiments quantify “web stress”

Fascinating study: Brain wave analysis reveals that people have to concentrate up to 50% more when using badly performing websites. EOG technology and behavioral analysis also reveal greater agitation and stress in these periods.

10. Why the performance measurement island you trust is sinking

I routinely encounter customers who have been led, by the very experts they trust, into believing that their site performance can be measured by the wrong tools. This post was written to explain exactly why you can’t always believe the experts.

This is my last post of 2011. Before I sign off for the year, I want to take a moment to thank you for coming to this site, for reading, and for your thoughtful comments. It’s a privilege to write for such an engaged community at such an exciting time in our industry. I’m looking forward to even more exciting times ahead.


Revisiting the performance equation

Way back in the good old days — that would be circa 2007, of course — we talked a lot about the performance equation here at Strangeloop. I came across this venerable equation today as I was going through some old reports, and I wanted to look at it again.

In September 2006, Peter Sevcik and Rebecca Wetzel of NetForecast published a paper called “Field Guide to Application Delivery Systems” (available for free by request here). The paper focused on improving wide area network (WAN) application performance and included the following equation:

While this equation looks at WAN performance, we wanted to take it and use it as the basis for a web application performance equation. So in 2007, we created this equation, along with our now rather dated definitions (note the IE 6 example), below:

Variable definitions

R: Response time. The total time from the user requesting a page (by clicking a link, and so on) to when the full page is rendered on the user’s computer. Typically measured in seconds.
Payload: Total bytes sent to the browser, including markup and all resources (such as CSS, JS, and image files).
Bandwidth: Rate of transfer to and from the browser. This may be asymmetrical and might represent multiple speeds if a given page is generated from multiple sources. Usually, it is averaged together to create a single bandwidth expressed in bytes per second.
AppTurns: The number of resource files a given page needs. These resource files will include CSS, JS, images, and any other files retrieved by the browser in the process of rendering the page. In the equation, the HTML page is accounted for separately by adding in round-trip time (RTT) before the AppTurns expression.
RTT: The time it takes to round-trip, regardless of bytes transferred. Every request pays a minimum of one RTT for the page itself. Typically measured in milliseconds.
Concurrent requests: Number of simultaneous requests a browser will make for resource files. By default, Internet Explorer 6 performs two concurrent requests. This setting can be adjusted but rarely is.
Cs: Compute time on the server. This is the time it takes for code to run, retrieve data from the database, and compose the response to be sent to the browser. Measured in milliseconds.
Cc: Compute time on the client. This is the time it takes for a browser to actually render the HTML on the screen, execute JavaScript, implement CSS rules, and so on.
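Read together, the definitions above suggest an equation of the following shape. This is a reconstruction from the definitions, not a copy of the original figure, so treat the exact form as an assumption:

```latex
R = \frac{\text{Payload}}{\text{Bandwidth}}
    + RTT
    + \frac{\text{AppTurns} \times RTT}{\text{Concurrent requests}}
    + C_s + C_c
```

The standalone RTT term is the round trip for the HTML page itself, and the fraction that follows spreads the remaining round trips across the browser's concurrent requests.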

Why I want to put this equation back on the table

What’s great about this equation, and the reason I want to revive it for conversation, is that it represents many of the areas where we add value. Equations like this are really useful for pointing out how you affect load time (e.g. using feature X makes payload smaller and thus reduces page load time).
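Here is a minimal Python sketch of that kind of what-if. It assumes the equation takes the form the variable definitions suggest: Payload / Bandwidth + RTT + (AppTurns x RTT) / Concurrent requests + Cs + Cc. The function name and every input value are made up for illustration, and all times are in seconds to keep the units consistent.

```python
def response_time(payload_bytes, bandwidth_bytes_per_s, app_turns, rtt_s,
                  concurrent_requests, cs_s=0.0, cc_s=0.0):
    """Estimate R in seconds.

    Assumed form (reconstructed from the variable definitions):
    Payload/Bandwidth + RTT + (AppTurns * RTT) / Concurrent requests + Cs + Cc
    """
    transfer_s = payload_bytes / bandwidth_bytes_per_s
    page_turn_s = rtt_s  # one round trip for the HTML page itself
    resource_turns_s = (app_turns * rtt_s) / concurrent_requests
    return transfer_s + page_turn_s + resource_turns_s + cs_s + cc_s


if __name__ == "__main__":
    # Hypothetical page: 750,000 bytes over 187,500 bytes/s (1.5 Mbps),
    # 60 resources, 50 ms RTT, 2 concurrent requests.
    baseline = response_time(750_000, 187_500, 60, 0.05, 2, cs_s=0.2, cc_s=0.3)
    smaller = response_time(375_000, 187_500, 60, 0.05, 2, cs_s=0.2, cc_s=0.3)
    print(f"baseline: {baseline:.2f} s, half the payload: {smaller:.2f} s")
```

With these made-up inputs, halving the payload drops the estimate from about 6.05 seconds to about 4.05 seconds, because only the transfer term changes.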

Another reason for resuscitating this performance equation is that some people have a hard time visualizing waterfalls, and I have found that an equation helps.

In an ideal world, I would love to put up a slide describing our technology, which looks something like this:

But the reality is that this is much more complicated than I wished for.

Exercise: How does the equation stack up next to a waterfall?

I am no math genius, but when I started thinking about the equation in terms of the modern waterfall, I realized quickly that either we were totally misguided, completely ignorant, or the world has changed dramatically.

I started the exercise by looking at a waterfall. In order to see how useful this is, I randomly selected the top test in the test history on WebPagetest. It happened to be Yahoo.

Very quickly, many unanswered questions started cropping up. For example: Given how different upload and download bandwidth are, do I need to divide these out and consider them differently?

So I made a small edit to the equation to take this into consideration:

Then I looked at the number of different domains and the fact that they all have different RTTs and different levels of concurrency. So I edited the equation again:

And another question: Where is connection and DNS time taken into consideration? If some roundtrips start new connections and/or resolve DNS, then that needs to be taken into consideration. So I edited the equation again:
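To show how quickly these edits pile up, here is a hypothetical Python sketch of one way the three changes described above could be folded into a single estimate. It is not the edited equation from this post: the `Domain` fields, the function name, and the choice to simply add the per-domain terms together are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    """Per-domain inputs (all hypothetical; times in seconds)."""
    app_turns: int            # resources fetched from this domain
    rtt_s: float              # round-trip time to this domain
    concurrent_requests: int  # parallel requests the browser makes to it
    dns_s: float = 0.0        # DNS lookup time, if a lookup is needed
    connect_s: float = 0.0    # time spent opening new connections


def extended_response_time(upload_bytes, upload_bytes_per_s,
                           download_bytes, download_bytes_per_s,
                           page_rtt_s, domains, cs_s=0.0, cc_s=0.0):
    """Estimate R in seconds with split bandwidth and per-domain round trips.

    Adding the per-domain terms ignores the fact that browsers fetch from
    several domains in parallel, so this is a deliberate oversimplification.
    """
    transfer_s = (upload_bytes / upload_bytes_per_s
                  + download_bytes / download_bytes_per_s)
    turns_s = sum(
        d.dns_s + d.connect_s + (d.app_turns * d.rtt_s) / d.concurrent_requests
        for d in domains
    )
    return transfer_s + page_rtt_s + turns_s + cs_s + cc_s
```

Even in this simplified form, one page now needs five inputs per domain before concurrency changes, blocking scripts, or client compute time are considered.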

Then things just got stupid as I tried to answer many more unanswered questions, such as:

  • How do I guess at a concurrency constant when concurrency is so varied within the waterfall?
  • How do I calculate Cc?
  • How do I capture blocking scripts?

The questions just kept coming.

After trying to take all of these items into consideration, I wound up with a big mess and abandoned the project altogether. My observations were as follows:

  • Our world is so complicated that a formula can’t be developed to synthetically deduce performance.
  • Formulas like this are descriptive in nature and should remain squarely in the hands of the marketing department.

Anyone know of anything that easily describes page load time? I’m willing to ship a nice case of Canadian beer to anyone who comes up with something representative and at least somewhat defensible.


Lots of news on the Strangeloop front

It’s been a busy week, even by Strangeloop standards:

Monday: We announced that we’ve successfully integrated Google’s SPDY protocol into our Site Optimizer appliance and service, making Site Optimizer the world’s first commercial product to offer site owners the opportunity to implement SPDY automatically on their sites. For background, our official press release is here, but I also really like this write-up by Steven Vaughan-Nichols at ZDNet and this one by Erica Naone at MIT’s Technology Review.

Tuesday: We announced that we’re partnering with Neustar Webmetrics to combine their ability to put their finger on precisely where performance pains happen with our ability to fix those pains. Neustar has a fantastic team, and we’re really looking forward to working with them.

Wednesday: Steve Souders announced at Velocity that Strangeloop is one of the sponsors of the HTTP Archive as it grows to support one million of the world’s leading websites. The archive promises to be an invaluable resource to those of us who care about tracking changes in how sites are built and delivered. (Case in point: Did you know that in just six months — from November 2010 to May 2011 — the average site grew by 8%, from 658k to 711k? That’s a pretty huge change for such a short time span! And we know about this change thanks to the archive.)

And the excitement isn’t over — now everyone’s at Velocity. (Including me, finally! My flight got delayed in Denver, and I just arrived.) While I’ve missed a chunk of the proceedings, there are still a lot of great sessions to look forward to, and great people to catch up with. If you’re here, say hi.
