28 Nov 2012
This is the final post in a series in which I’ve been addressing a question that was put to me by a customer a few months ago:
“Aren’t many of the web performance rules described by Steve Souders in 2007 already outdated or made obsolete by browser innovation?”
- Avoid redirects
- Remove duplicate scripts
- Configure ETags
- Make AJAX cacheable
1. As with my previous post, I used the test cases of the performance rules that Steve created five years ago, which are still available. In my earlier posts, I timestamped them to October 2007 using the Wayback Machine.
2. Next, I ran each test case on Internet Explorer 6 on WebPagetest to get a sense of what Steve would have seen for performance rules 11 through 14. I did three runs of each test and used the median result.
3. Then I ran each test case on Chrome 23 to see what impact, if any, the rules have on a modern browser. Again, I performed three runs of each test and recorded the median result.
4. Using the above approach, I was able to see and compare the before-and-after results for each rule for both browsers, and then calculate the benefit.
Reminder: As I stated in my earlier posts, the goal here is not to look at how fast Chrome 23 is versus Internet Explorer 6. We have looked at this in the past. This time I want to look at the relative benefit of each performance rule.
As it turned out, there were no test cases for three out of four of these rules, for reasons I’ll explain below. Here’s a high-level look at the results:
Rule 11: Avoid redirects
What this rule means: In broad terms, a redirect automatically forwards the browser from one URL to another, either permanently or temporarily. A permanent redirect uses response code 301. There are several temporary variants, but the response code most commonly used for a temporary redirect is 302. Sites use redirects for several reasons, such as fixing missing trailing slashes, connecting websites, and internal/outbound tracking, to name just a few.
Fixing missing trailing slashes
This is a really common dev mistake. As Steve says, “One of the most wasteful redirects happens frequently and web developers are generally not aware of it. It occurs when a trailing slash (/) is missing from a URL that should otherwise have one.”
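To make the cost concrete, here's a minimal sketch (mine, not Steve's; the paths and content are hypothetical) of how a server typically answers a directory request that is missing its trailing slash:

```python
# Hypothetical directory index; in reality this would be the filesystem.
DIRECTORIES = {"/docs/": "<html>docs index</html>"}

def handle(path):
    """Return (status, location_or_body) for a GET request."""
    if path in DIRECTORIES:
        return 200, DIRECTORIES[path]
    if path + "/" in DIRECTORIES:
        # Missing trailing slash: the browser burns a full extra round
        # trip on the 301 before it can even request the real document.
        return 301, path + "/"
    return 404, ""
```

Linking to the slash-terminated form in your markup avoids the extra round trip entirely.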
Connecting websites
For example, if the URL for your old website was www.goodsite.com and you wanted to change it to www.bettersite.com, you’d implement a 301 redirect from the old URL to the new one. Now whoever typed in your old URL (or clicked on a leftover link to your old URL) would automatically get taken to your new URL.
Tracking internal/outbound traffic
Redirects are also used to track where visitors go next. When you’re about to leave a site (say, from a search results page), instead of hyperlinking directly to the new site, the link takes you to a URL on the current site that then redirects you to the new one. That in-between URL records the fact that you’re leaving the site and where you’re going.
Browsers can’t fix redirect problems. The fix lies with site owners.
As Steve points out, “The main thing to remember is that redirects slow down the user experience. Inserting a redirect between the user and the HTML document delays everything in the page since nothing in the page can be rendered and no components can start being downloaded until the HTML document has arrived.”
As the rule states, avoid redirects. For example, don’t use redirects to keep track of clicks that leave your site. Instead, Steve offers two alternative techniques that send a non-blocking beacon as a visitor clicks away from the page, so the visitor doesn’t have to wait for the redirect before going to a new site.
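As a sketch of what the server side of such a beacon might look like (the endpoint and names are hypothetical, not Steve's code), the logging endpoint simply records the click and returns 204 No Content, so nothing blocks the visitor's navigation:

```python
# Hypothetical click log; in production this would go to analytics storage.
CLICK_LOG = []

def beacon(query):
    """Handle GET /beacon?dest=...: log the click, return 204."""
    dest = query.get("dest", "")
    CLICK_LOG.append(dest)
    # 204 No Content: nothing for the browser to render or wait on,
    # and the outbound link itself points straight at the destination.
    return 204, ""
```

The link in the page goes directly to the destination site; the beacon request runs in the background.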
Testing the rule: N/A
Rule 12: Remove duplicate scripts
What this rule means: It’s no surprise that duplicate scripts — which generate unnecessary HTTP requests and waste time evaluating the same script more than once — hurt performance. What is surprising is that this mistake happens so often. This issue is more likely to pop up when development teams are large and when pages contain a huge number of scripts.
As Steve said back in 2007, “Unnecessary HTTP requests happen in Internet Explorer, but not in Firefox. In Internet Explorer, if an external script is included twice and is not cacheable, it generates two HTTP requests during page loading. Even if the script is cacheable, extra HTTP requests occur when the user reloads the page.”
Testing the rule: Steve experimented with implementing caching on a page to test its ability to deal with duplicate scripts. As you can see in the table below, this had a significant impact on IE6. The impact on Chrome was negligible, probably because Chrome parallelizes downloads better and caches duplicate scripts more aggressively, even for resources that were not coded to be cached.
I also tested different versions of Internet Explorer, in order to pinpoint the version that evolved to address the duplicate scripts problem. As you can see in the table below, this happened with IE8.
| Test case | Benefit in IE6 | Benefit in IE7 | Benefit in IE8 | Benefit in Chrome 23 |
|---|---|---|---|---|
| Duplicate script – cached | 20% | 16% | 1% | 1% |
| Duplicate script – 10 cached | 19% | 15% | 1% | 1% |
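If you do need to support those older browsers, a simple build-time check can catch duplicate scripts before they ship. Here's a sketch of such a helper (my own, not from the post):

```python
from collections import Counter
from html.parser import HTMLParser

class ScriptCollector(HTMLParser):
    """Collect the src attribute of every external <script> tag."""
    def __init__(self):
        super().__init__()
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.srcs.append(src)

def duplicate_scripts(html):
    """Return script URLs that appear more than once in the page."""
    parser = ScriptCollector()
    parser.feed(html)
    return [src for src, count in Counter(parser.srcs).items() if count > 1]
```

Running this over generated pages in CI would flag the mistake before IE6/IE7 users pay for it.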
Rule 13: Configure ETags
What this rule means: There’s no test case for this rule — because this is a server issue, not a front-end issue — but I still want to review it here in order to talk about why this rule is important.
An ETag (also known as entity tag) is a string that uniquely identifies a specific version of a page object. Web servers and browsers use the ETag to determine whether an object in the browser’s cache matches the object on the origin server. When implemented correctly, ETags can help performance for repeat visits because they provide a kind of shortcut that allows the browser to validate objects more quickly.
Sounds good, right? But there’s a catch. As Steve points out, “The problem with ETags is that they typically are constructed using attributes that make them unique to a specific server hosting a site. ETags won’t match when a browser gets the original component from one server and later tries to validate that component on a different server—a situation that is all too common on web sites that use a cluster of servers to handle requests. By default, both Apache and IIS embed data in the ETag that dramatically reduces the odds of the validity test succeeding on web sites with multiple servers.”
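To illustrate the problem, here's a sketch (the values are made up) of how a default, server-specific ETag recipe (historically inode, size, and mtime on Apache) produces different tags on two servers holding identical content:

```python
def apache_style_etag(inode, size, mtime):
    """Mimic Apache's historic default ETag: inode-size-mtime in hex."""
    return '"%x-%x-%x"' % (inode, size, mtime)

# Same file, synced to two servers: identical size and mtime,
# but the inode is a filesystem detail that differs per machine.
server_a = apache_style_etag(inode=8123, size=4096, mtime=1353990000)
server_b = apache_style_etag(inode=9904, size=4096, mtime=1353990000)
# The tags differ, so the browser's If-None-Match check misses and the
# server sends the full response again instead of a cheap 304.
```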
As Steve counsels, “If you’re not taking advantage of the flexible validation model that ETags provide, it’s better to just remove the ETag altogether. The Last-Modified header validates based on the component’s timestamp. And removing the ETag reduces the size of the HTTP headers in both the response and subsequent requests.”
To clarify: ETags are not bad in and of themselves. They’re actually a more accurate way of performing cache validation. But because of that precision, if you want to harness the power of ETags, you need to make sure the same ETag is deterministically generated across all of your servers for a given resource. If you’re not going to do that, it’s best not to use them, since misconfigured ETags will hurt you.
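One way to get deterministic ETags, sketched below (my own example, not from the post), is to derive the tag from the content alone, so every server in a cluster emits the same value and If-None-Match validation can succeed with a 304:

```python
import hashlib

def content_etag(body: bytes) -> str:
    """Derive the ETag purely from content, so any server computes the same tag."""
    return '"%s"' % hashlib.md5(body).hexdigest()

def respond(body, if_none_match=None):
    """Return (status, etag, body), honoring conditional requests."""
    etag = content_etag(body)
    if if_none_match == etag:
        return 304, etag, b""  # validated: headers only, no body
    return 200, etag, body
```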
Testing the rule: N/A
Rule 14: Make AJAX cacheable
What this rule means: There’s no test case for this rule either, because Steve counsels applying the same performance rules already discussed to your Ajax requests — particularly setting a far-future Expires header.
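As a sketch of applying that advice (the header values and versioning scheme are illustrative assumptions, not prescriptions), an Ajax response can carry the same caching headers as any static asset, with a versioned URL to force a fresh fetch when the underlying data changes:

```python
def ajax_headers(max_age=31536000):
    """Caching headers for an Ajax response; one year is a common far-future value."""
    return {
        "Cache-Control": "public, max-age=%d" % max_age,
        "Expires": "Thu, 28 Nov 2013 00:00:00 GMT",  # example far-future date
    }

def versioned_url(base, version):
    """Embed a version in the URL so changed data gets a new cache entry."""
    return "%s?v=%s" % (base, version)
```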
Testing the rule: N/A
Here’s a summary of everything I learned over the course of revisiting all 14 of Steve’s rules:
Most of the performance rules have stood the test of time.
As I stated in the earlier posts, it’s striking how similar the results are for many of the test cases. I had expected that the rules would still be relevant, but I had also expected that browser evolution would have resulted in a greater gap between the results for IE6 and Chrome. If you build web applications and you care about performance, Steve’s book should still be your bible.
Reducing roundtrips still matters.
It doesn’t matter how much better modern browsers are at rendering page objects; fewer calls to the server still make a huge difference. Seeing that adding an Expires header leads to a 59% improvement in Chrome 19 tells me this technique is still incredibly relevant.
A CDN helps in some situations, but not all.
A CDN is a must for many sites, but it’s not a standalone performance solution. Benefits will vary depending on which CDN you choose, as well as on things like how the CDN stores content and how far its PoPs are from your users.
Compression still helps. A lot.
Not only did gzipping still offer benefits in Chrome, the benefits were even greater than in IE6. This is a really compelling finding. It shows that, despite the advances made in modern browsers, modern pages have also changed a lot — meaning they can benefit from compression more than ever.
The location of stylesheets is important, but it depends on the page composition.
Start render really matters, and the placement of your stylesheets has a direct impact on it.
“Avoid CSS expressions” is obsolete… for all the right reasons.
It was interesting to see how a rule — avoid CSS expressions — has become so entrenched over time that it’s no longer an issue.
Unless you’re developing for older versions of IE, you probably don’t need to worry about avoiding duplicate scripts.
Newer browsers take care of the duplicate script problem — possibly because they parallelize downloads better and/or cache more aggressively. But be sure to know whether or not a significant portion of your traffic still uses IE6 or IE7. If it does, then you still need to apply this rule. If you don’t apply it, you could be missing out on a 15-20% performance gain.
I had a great time with this set of posts and want to thank Steve, not just for pioneering the original set of rules, but for his support and advice as I worked my way through all my tests. Thanks, Steve!