
Browser innovation and the 14 rules for faster loading websites: Revisiting Steve’s work (part 3)

This is the final post in a series in which I’ve been addressing a question that was put to me by a customer a few months ago:

“Aren’t many of the web performance rules described by Steve Souders in 2007 already outdated or made obsolete by browser innovation?”

You can check out parts 1 and 2, though it’s not necessary to read those first in order to understand today’s post. Today, I’m looking at the last four rules:

  • Avoid redirects
  • Remove duplicate scripts
  • Configure ETags
  • Make AJAX cacheable

Methodology

1. As with my previous post, I used the test cases of the performance rules that Steve created five years ago, which are still available. In my earlier posts, I timestamped them to October 2007 using the Wayback Machine.

2. Next, I ran each test case on Internet Explorer 6 on WebPagetest to get a sense of what Steve would have seen for performance rules 11 through 14. I did three runs of each test and used the median result.

3. Then I ran each test case on Chrome 23 to see what impact, if any, the rules have on a modern browser. Again, I performed three runs of each test and recorded the median result.

4. Using the above approach, I was able to see and compare the before-and-after results for each rule for both browsers, and then calculate the benefit.
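As a rough illustration of steps 2 through 4, here is a minimal Python sketch that takes the median of a set of runs with and without a rule applied and expresses the difference as a percentage. The function name `relative_benefit` and the sample timings are hypothetical, and the post does not spell out the exact formula used, so treat the percentage calculation as an assumption.

```python
from statistics import median

def relative_benefit(before_runs, after_runs):
    """Percentage reduction in median load time once a rule is applied.

    Assumed formula: (before - after) / before * 100.
    """
    before = median(before_runs)
    after = median(after_runs)
    return (before - after) / before * 100.0

# Hypothetical load times in seconds (three runs each, as in the methodology).
rule_not_applied = [4.2, 4.0, 4.6]
rule_applied = [3.3, 3.2, 3.5]

print(f"benefit: {relative_benefit(rule_not_applied, rule_applied):.1f}%")
```

Using the median rather than the mean keeps one unusually slow run from skewing the comparison.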

Reminder: As I stated in my earlier posts, the goal here is not to look at how fast Chrome is versus Internet Explorer 6. We have looked at this in the past. This time I want to look at the relative benefit of each performance rule.

Findings

As it turned out, there were no test cases for three out of four of these rules, for reasons I’ll explain below. Here’s a high-level look at the results:

Rule 11: Avoid redirects

What this rule means: In broad terms, a redirect sends the browser from one URL to another, either permanently or temporarily. A permanent redirect uses response code 301. There are several temporary ones, but the response code most commonly used for a temporary redirect is 302. Sites use redirects for several reasons, such as fixing missing trailing slashes, connecting websites, and internal/outbound tracking, to name just a few.

Fixing missing trailing slashes

This is a really common dev mistake. As Steve says, “One of the most wasteful redirects happens frequently and web developers are generally not aware of it. It occurs when a trailing slash (/) is missing from a URL that should otherwise have one.”
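One way to avoid this redirect is to write links with the trailing slash already in place. The sketch below is a heuristic, not a rule taken from Steve's book: it assumes that a final path segment without a dot refers to a directory-style URL. The helper name `with_trailing_slash` and the example.com URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

def with_trailing_slash(url):
    """Append "/" to directory-style URLs so the server need not redirect.

    Heuristic (an assumption): a last path segment containing a dot is
    treated as a file name and left alone.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))

print(with_trailing_slash("http://www.example.com/blog"))
print(with_trailing_slash("http://www.example.com/index.html"))
```

Whether a given URL really needs the slash depends on how the server maps paths, so check against your own site before applying this to every link.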

Connecting websites

For example, if the URL for your old website was www.goodsite.com and you wanted to change it to www.bettersite.com, you’d implement a 301 redirect from the old URL to the new one. Now whoever typed in your old URL (or clicked on a leftover link to your old URL) would automatically get taken to your new URL.

Internal/outbound tracking

Redirects are a way to figure out where visitors are going next. When you’re about to leave a site (say, from a search result page), instead of just hyperlinking to the new site, the link takes you to a URL on the current site that then redirects you to the new site. That in-between URL tracks the fact that you’re leaving the site and where you’re going.

Browsers can’t fix redirect problems. This fix lies with site owners.

As Steve points out, “The main thing to remember is that redirects slow down the user experience. Inserting a redirect between the user and the HTML document delays everything in the page since nothing in the page can be rendered and no components can start being downloaded until the HTML document has arrived.”

As the rule states, avoid redirects. For example, don’t use redirects to keep track of clicks that leave your site. Instead, Steve offers two alternative techniques that send a non-blocking beacon as a visitor clicks away from the page, so the visitor doesn’t have to wait for the redirect before going to a new site.

Testing the rule: N/A

Rule 12: Remove duplicate scripts

What this rule means: It’s no surprise that duplicate scripts — which generate unnecessary HTTP requests and waste time evaluating the same script more than once — hurt performance. What is surprising is that this mistake happens so often. This issue is more likely to pop up when development teams are large and when pages contain a huge number of scripts.

As Steve said back in 2007, “Unnecessary HTTP requests happen in Internet Explorer, but not in Firefox. In Internet Explorer, if an external script is included twice and is not cacheable, it generates two HTTP requests during page loading. Even if the script is cacheable, extra HTTP requests occur when the user reloads the page.”
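Because this mistake tends to creep in unnoticed, a quick automated check can help. The following is a minimal sketch using only Python's standard library. The class and function names are hypothetical, and it compares `src` values as written, so two different URLs that resolve to the same file would not be caught.

```python
from collections import Counter
from html.parser import HTMLParser

class ScriptCollector(HTMLParser):
    """Collects the src attribute of every external <script> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)

def duplicate_scripts(html):
    """Return {src: count} for scripts included more than once."""
    collector = ScriptCollector()
    collector.feed(html)
    counts = Counter(collector.sources)
    return {src: n for src, n in counts.items() if n > 1}

# Hypothetical page markup for illustration.
page = """
<html><head>
<script src="/js/menu.js"></script>
<script src="/js/tracking.js"></script>
<script src="/js/menu.js"></script>
</head><body></body></html>
"""
print(duplicate_scripts(page))
```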

Testing the rule: Steve experimented with implementing caching on a page to test its ability to deal with duplicate scripts. As you can see in the table below, this had a significant impact on IE6. The impact on Chrome was negligible, probably because Chrome does better parallelism and is more aggressive with caching duplicate scripts, even with resources that were not coded to be cached.

I also tested different versions of Internet Explorer, in order to pinpoint the version that evolved to address the duplicate scripts problem. As you can see in the table below, this happened with IE8.

Test case                       Benefit in IE6   Benefit in IE7   Benefit in IE8   Benefit in Chrome 23
Duplicate script – cached       20%              16%              1%               1%
Duplicate script – 10 cached    19%              15%              1%               1%

Rule 13: Configure ETags

What this rule means: There’s no test case for this rule — because this is a server issue, not a front-end issue — but I still want to review it here in order to talk about why this rule is important.

An ETag (also known as entity tag) is a string that uniquely identifies a specific version of a page object. Web servers and browsers use the ETag to determine whether an object in the browser’s cache matches the object on the origin server. When implemented correctly, ETags can help performance for repeat visits because they provide a kind of shortcut that allows the browser to validate objects more quickly.

Sounds good, right? But there’s a catch. As Steve points out, “The problem with ETags is that they typically are constructed using attributes that make them unique to a specific server hosting a site. ETags won’t match when a browser gets the original component from one server and later tries to validate that component on a different server—a situation that is all too common on web sites that use a cluster of servers to handle requests. By default, both Apache and IIS embed data in the ETag that dramatically reduces the odds of the validity test succeeding on web sites with multiple servers.”

As Steve counsels, “If you’re not taking advantage of the flexible validation model that ETags provide, it’s better to just remove the ETag altogether. The Last-Modified header validates based on the component’s timestamp. And removing the ETag reduces the size of the HTTP headers in both the response and subsequent requests.”

To clarify: ETags are not bad things in and of themselves. They’re actually a more accurate way of performing invalidation. But, because of their preciseness, if you want to harness the power of ETags, you need to make sure the same ETag is deterministically generated across your servers for a specific resource. If you’re not going to do that, it’s best not to use them, since misconfigured ETags will hurt you.
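One common way to get the same ETag from every server is to derive it from the content of the resource alone, rather than from server-specific attributes. The sketch below assumes that approach. The function names are hypothetical, the choice of SHA-256 truncated to 16 hex characters is arbitrary, and weak validators (`W/"..."`) are not handled.

```python
import hashlib

def content_etag(body):
    """Build an ETag from the response bytes only, so every server
    holding the same bytes produces the same value."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def is_not_modified(if_none_match, body):
    """True when the browser's If-None-Match header matches the current
    ETag, meaning a 304 response with no body would be appropriate."""
    if if_none_match is None:
        return False
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    return "*" in candidates or content_etag(body) in candidates

stylesheet = b"body { margin: 0; }"
tag_from_server_a = content_etag(stylesheet)
tag_from_server_b = content_etag(stylesheet)
print(tag_from_server_a == tag_from_server_b)
print(is_not_modified(tag_from_server_a, stylesheet))
```

Hashing every response has a CPU cost, so in practice the value would usually be computed once per file version and stored.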

Testing the rule: N/A

Rule 14: Make AJAX cacheable

What this rule means: No test case for this rule, this time because Steve counsels that you apply the same performance rules already discussed to your Ajax requests — particularly having a far future Expires header.
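For reference, a far future Expires header is just an HTTP date set well ahead of the current time. Here is a minimal sketch of generating one with Python's standard library. The function name and the one-year default are assumptions, not values taken from Steve's rule.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

def far_future_expires(now=None, days=365):
    """Return an Expires header dated `days` ahead, in HTTP date format."""
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(days=days)
    # usegmt=True requires a UTC-aware datetime and emits the "GMT" suffix.
    return {"Expires": format_datetime(expires, usegmt=True)}

print(far_future_expires())
```

This only makes sense for Ajax responses whose URL changes when the underlying data changes. Otherwise visitors would keep seeing stale data until the header expires.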

Testing the rule: N/A

Let’s recap.

Here’s a summary of everything I learned over the course of revisiting all 14 of Steve’s rules:

Most of the performance rules have stood the test of time.

As I stated in the earlier posts, it’s striking how similar the results are for many of the test cases. I had expected that the rules would still be relevant, but I had also expected that browser evolution would have resulted in a greater gap between the results for IE6 and Chrome. If you build web applications and you care about performance, Steve’s book should still be your bible.

Reducing roundtrips still matters.

It doesn’t matter how much better modern browsers are at rendering page objects; fewer calls to the server still make a huge difference. Seeing that adding an Expires header leads to a 59% improvement in Chrome 19 tells me this technique is still incredibly relevant.

A CDN helps in some situations, but not all.

A CDN is a must for many sites, but it’s not a standalone performance solution. Benefits will vary depending on which CDN you choose, as well as things like how your CDN stores content and how far their PoPs are from users.

Compression still helps. A lot.

Not only did Gzipping offer benefits in Chrome, it offered even greater benefits than in IE6. This is a really compelling finding. It shows that, despite the advances made in modern browsers, modern pages have also changed a lot — meaning that more than ever they can benefit from compression.

The location of stylesheets is important, but it depends on the page composition.

Start render really matters and the location of the stylesheets is critical.

“Avoid CSS expressions” is obsolete… for all the right reasons.

It was interesting to see how a rule — avoid CSS expressions — has become so entrenched over time that it’s no longer an issue.

Unless you’re developing for older versions of IE, you probably don’t need to worry about avoiding duplicate scripts.

Newer browsers take care of the duplicate script problem, possibly thanks to better parallelism and/or more aggressive caching. But be sure to know whether a significant portion of your traffic uses IE6 or IE7. If it does, then you still need to apply this rule. If you don’t apply it, you could be missing out on an opportunity to make a 15-20% performance gain.

Conclusion

I had a great time with this set of posts and want to thank Steve, not just for pioneering the original set of rules, but for his support and advice as I worked my way through all my tests. Thanks, Steve!


Can swing voters be swung on web performance? If so, which candidate would take the election?

Last spring, I did a fun little exercise where I measured the load times of the campaign sites belonging to candidates in the Republican primary, and then plotted them on a graph to see how their speed correlated to their position in the polls. At the time, Romney was the front runner in the polls. He also had one of the fastest sites — though “fast” is a relative term here — at 9.3 seconds. Interestingly, Obama’s site was slower than all the GOP candidates’ sites, at 13.6 seconds.

A lot has happened since then. Articles have been written about mobile use among minority groups (who reportedly tend to vote Democrat), the candidates’ different mobile strategies (Romney’s mobile-specific site vs. Obama’s responsively designed site), and the role of social media in the election.

With election day around the corner, I thought it would be interesting to take a fresh look at the candidates’ pages on both desktop and mobile and see what comes up. As it turns out, there’s some good debate fodder in the results.

Approach

1. Desktop: Tested splash screens and home pages at barackobama.com and mittromney.com 10 times each on Internet Explorer 8 via WebPagetest’s servers in Dulles (VA), Asheville (NC), Miami (FL), Kansas City (MO), and Los Angeles (CA). Calculated median load times for each.

Splash screens:

Home pages:

2. Mobile: Tested splash screens and home pages at barackobama.com and m.mittromney.com 5 times each on iPhone 4 over 3G and Wifi via stopwatch. Cleared cache and cookies between each set of tests. Calculated median load times for each.

Splash screens:

Home pages:

Desktop results (load times, in seconds, on IE8)

Location           barackobama.com – splash   mittromney.com – splash
Dulles, VA         6.414                      8.354
Asheville, NC      6.233                      8.888
Miami, FL          5.467                      8.454
Kansas City, MO    5.315                      9.300
Los Angeles, CA    5.290                      9.483
median             5.467                      8.888

Above, we see Obama’s splash page come out ahead, 38% faster than Romney’s.
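The medians and the 38% figure can be reproduced from the splash page numbers above. This is a small sketch and the variable names are mine. It assumes "faster" means the reduction in load time relative to the slower page, which is the reading that matches the stated percentage.

```python
from statistics import median

# Splash page load times in seconds, one per test location (from the table).
obama_splash = [6.414, 6.233, 5.467, 5.315, 5.290]
romney_splash = [8.354, 8.888, 8.454, 9.300, 9.483]

obama_median = median(obama_splash)
romney_median = median(romney_splash)

# Reduction in load time relative to the slower (Romney) page.
faster_pct = (romney_median - obama_median) / romney_median * 100

print(obama_median, romney_median, round(faster_pct))  # prints: 5.467 8.888 38
```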

Location           barackobama.com – home   mittromney.com – home
Dulles, VA         12.398                   15.010
Asheville, NC      n/a                      n/a
Miami, FL          12.951                   11.744
Kansas City, MO    16.372                   11.415
Los Angeles, CA    15.291                   11.371
median             14.121                   11.580

But Romney’s home page fully loads 20% faster than Obama’s.

Mobile results (load times, in seconds, using iPhone 4’s native Safari browser)

Connection   barackobama.com – splash   m.mittromney.com – splash
Wifi         8.782                      4.450
3G           10.171                     14.854

Romney’s m.site splash page loads in about half the time it takes for Obama’s RWD page to load over Wifi, but takes 46% longer over 3G.

Connection   barackobama.com – home   m.mittromney.com – home
Wifi         5.836                    6.872
3G           21.309                   21.719

Obama’s responsively designed home page just edges out Romney’s m.site over Wifi and 3G.

Finding 1: Both candidates’ pages are bulky, slow, and contain resources from many domains.

Each candidate’s home page is big. Obama’s contains around 250 resources (images, CSS/JavaScript, etc.), outweighing Romney’s still-considerable 79 resources. And both home pages weigh in at more than 2 MB in size – massive even by today’s standards, when a typical page is around 1 MB.

Not only are these pages big, they’re also grabbing resources from a lot of domains: 37 for Obama’s site and 31 for Romney’s. Each of these unique domains extracts a performance penalty, due to the fact that the user’s browser has to perform a DNS lookup for each one, kind of like how you use a phone book to find someone’s phone number using their first and last name. It typically takes 20-120 milliseconds for DNS to look up the IP address for a given hostname. The browser can’t download anything from this hostname until the DNS lookup is completed. All those DNS lookups can add up, and that’s before you even factor in the transfer time for the actual file.
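To get a feel for how this adds up, here is a back-of-the-envelope sketch that counts the unique hostnames in a list of resource URLs and multiplies by the 20-120 millisecond range quoted above. The function name and sample URLs are hypothetical. The result is a worst case, since it assumes every lookup is uncached and happens one after another, and real browsers overlap and cache lookups.

```python
from urllib.parse import urlsplit

def dns_cost_range_ms(resource_urls, low_ms=20, high_ms=120):
    """Return (unique_hosts, low_total_ms, high_total_ms) for a page."""
    hosts = set()
    for url in resource_urls:
        host = urlsplit(url).hostname
        if host:
            hosts.add(host)
    return len(hosts), len(hosts) * low_ms, len(hosts) * high_ms

# Hypothetical resource list for illustration.
resources = [
    "http://www.example.com/index.html",
    "http://static.example.com/site.css",
    "http://static.example.com/site.js",
    "https://fonts.example.net/font.woff",
]
print(dns_cost_range_ms(resources))  # prints: (3, 60, 360)
```

Plugging in 37 domains gives a range of 740 to 4,440 milliseconds under the same worst-case assumptions.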

Grabbing the filmstrip view of each page load gives you a good idea of what a huge payload combined with multiple domains does to performance.

First, check out Obama’s home page, which doesn’t substantially render until the 9-second mark:

Romney’s page loads the outer content first, leaving the banner call to action (ostensibly the most important content on the page) to load last, at around 12 seconds:

If you contrast this with the results in the table above, you’ll note that, while on paper Romney’s home page fully loads 20% faster than Obama’s, the Romney home page actually appears to take 3 seconds longer from a real user’s perspective. (This is a great reminder of the importance of getting down on the ground and watching how pages actually perform, and not just looking at numbers in a chart.)

Finding 2: Secure resources are a major performance roadblock at barackobama.com.

While Obama’s pages may load perceptually faster than Romney’s, they seem to be struggling against some performance impediments in the form of a heap of mostly unnecessary SSL resources.

Below is a waterfall chart for Obama’s splash page, with each row representing a unique page element. If you’re not familiar with waterfall charts, I recommend this layperson-friendly primer, but for our purposes today, it’s enough to know that the purple bars all represent SSL negotiation. The longer the purple bar, the more time it takes for this negotiation to take place.

The waterfall for Obama’s splash screen is filled with SSL resources:

Romney’s waterfall, not so much:

SSL can have a serious impact on load times. By my estimate, it’s adding about 2 seconds to Obama’s page, which is a major performance hit. And this is just the splash screen.

I’ve talked before about the impact of secure content on performance. If your pages absolutely require SSL resources, then you can be smart about it by re-using TCP connections and creating longer-lasting SSL connections. Alternatively, you can offload SSL processing to another device, such as a dedicated load balancer or firewall.

But the simpler fix is to avoid using SSL content on pages that don’t need it. It’s rare to find a good reason to mix SSL and non-SSL resources on a page. Looking at waterfalls for Obama’s pages, I’m seeing that a lot of the SSL resources are fonts and visual assets. There’s no good reason for this, as far as I can see.

(Why do SSL resources accidentally end up on pages where they don’t belong? Here’s how: Using unnecessarily secure tags is a simple enough mistake to make. Some third-party tags are secure by default. Site devs will include the default tag in page templates, so that it’s automatically used on every page of a site, even the non-SSL pages. There are non-SSL versions of these tags. Ideally, devs should use the SSL version on secure pages, and the non-SSL version on the rest of their pages.)
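The idea of matching the tag to the page can be sketched in a few lines. This is only an illustration of the suggestion above: the function name and the `tags.example.com` host are hypothetical, and it assumes the third party serves the same tag over both schemes.

```python
from urllib.parse import urlsplit

def tag_url_for_page(page_url, tag_host_and_path):
    """Pick the secure tag for secure pages and the plain tag otherwise."""
    page_scheme = urlsplit(page_url).scheme
    scheme = "https" if page_scheme == "https" else "http"
    return f"{scheme}://{tag_host_and_path}"

print(tag_url_for_page("http://www.example.com/", "tags.example.com/widget.js"))
print(tag_url_for_page("https://www.example.com/donate", "tags.example.com/widget.js"))
```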

Finding 3: At a glance, responsive design doesn’t confer a significant mobile performance advantage on Obama’s pages. However…

This could be (and probably is) due to the payload, domain, and security issues already discussed. From a usability perspective, it was interesting to note that, when I looked at both candidates’ sites on my phone, the images and buttons on barackobama.com were crisp and clear. In contrast, looking at the splash page for m.mittromney.com, the primary image is on the muddy side and the call-to-action button is extremely low-res. It looks like someone in Romney’s camp has taken a hardline approach to optimizing this page without worrying about the impact on image quality.

Questions

These findings raise more questions than they answer, but I think they’re interesting — and in the right circles, inflammatory :) — questions:

1. Does site speed actually matter in this scenario?

We know that even small differences in load time — as little as 250 milliseconds, according to Google — can be a competitive differentiator in ecommerce. We know that the speed of a site has a definite impact on customer satisfaction and brand perception. Does this hold true for politics as well? Can a swing voter be swung on web performance?

2. Can we extrapolate anything about the candidates’ stances on issues like site security and mobile design?

Depending on whom you ask, responsive web design is the saviour of mobile performance, and anyone who’s not jumping on the wagon is hopelessly out of touch. I can imagine how tempting it is for Obama supporters to use this as an analogy for Romney. What kinds of assumptions, if any, could you make about the crop of SSL content on Obama’s pages?

3. And the most important question: Who will win the election?

Based on the following logic (which seems as scientifically solid as many of the polls I see):

  • Ohio will determine the election
  • More independent voters use desktop than mobile
  • The Asheville NC test results correlate roughly with performance in Ohio (Asheville is two miles closer to Columbus than Dulles is on Google maps)
  • Obama is about 20% faster from this location

I predict an Obama win.*

*Margin of error 96% 1 time out of 25.


WebPerfDays follow-up: 36 questions about web performance tools, measurement, and best practices

Today’s post is kind of like those blog posts where people answer random questions about themselves. Just a bit geekier. :)

When I was at WebPerfDays last week, I was walking past the ubiquitous board of post-it notes and started thinking about the fact that at every conference, so many of those questions and ideas go undiscussed. So I decided to snap each one and have made my best attempt at answering every question.

Before you read these, a big caveat: With many of these questions, the answer really depends on the situation. I’ve noted this where it’s particularly relevant, and given general answers everywhere else.

1. Why are Google Analytics so slow? Watch page loads – it’s always waiting on Google.

If you use the right script it should not affect your users.

2. How much impact can be made by optimising CSS selectors and does this have a trade off with maintainability?

Very little for most pages.

3. Do mobile networks do their own TCP optimization to devices?

Yes.

4. Can performance + webfonts coexist?

Yes.

5. Experiences with Android web driver?

None.

6. Will HTML5 make things worse?

Yes.

7. How build and deployment shapes our software architecture at thetrainline.com?

Don’t understand this question. Anyone?

8. GeoDNS versus Anycast Servers

GeoDNS.

9. Can we make significant improvements in website performance without further improvement in browsers or protocols?

Yes.

10. ASP.Net server side instrumentation, analysis, and interpretation

Ask Richard Campbell.

11. Pros and cons for loading 3rd party JavaScript in head/footer best practices

Footer, if possible.

12. Should I use Google or Microsoft CDN for my JQuery, JQuery UI, etc. references or host them myself?

Either. Shouldn’t matter much.

13. Offline and embedded web COAP

Nice and lightweight — functional enough?

14. Large companies and embedded web

Complicated.

15. Have there been any studies of how good Time To First Byte is as an approximation for true server time? e.g. DNS, connection time

Yes: here and here.

16. API performance

Important.

17. GEO – Distributed

Can be complicated. Do you really need it?

18. Responsive image format

There’s a group working on a standard.

19. Using web analytics to identify (and fix) web performance issues

I love it. See here, here, and here.

20. Metric-driven development

Does any other type warrant discussion?

21. No SQL Scale and Speed and survey of options

Not my area of expertise.

22. CDN vs. web proxy in some datacenters

Can someone clarify this question?

23. What is the best way to get customers interested in web performance when their website or web apps don’t sell stuff?

Find a metric they care about (e.g. productivity, retention, bandwidth), and I bet it connects back to speed.

24. Tonnes of great stuff at Velocity – what do you start with?

Measure your performance on WebPagetest. Iterate and measure again.

25. How many users is enough? When and what to deploy web KPI – new tech release schedules and users based?

More more more.

26. MVC JavaScript (backbone)

It works.

27. How many data centres?

Do you need the complexity of two, or is it for your ego?

28. Continuous delivery versus bureaucracy

Easy to say, hard to do. Do A/B development: put together a team of both and see who enchants the business.

29. Do the radios in 3G dongles behave the same as phones (from sleep to work)?

I would assume yes.

30. Fave perf. tools and what is missing?

See Steve Souders’s post for tools. Missing: Resource timings.

31. Discussion on what is the best timer to measure onload, first interaction, anything else? Multipage times

All of it. Correlate to biz metrics.

32. What are the best RUM tools?

Boomerang is a good place to start.

33. Has anyone used lo-fi RUM (stopwatch timing etc.) to validate their collected metrics/KPIs?

Yes. I should write a post about it.

34. Best way of building performance testing into CI build?

Script WebPagetest or HttpWatch or Web Page Speed.

35. Continuous performance analysis tools

Take a look at WPT Monitor.

36. Should we expect FEO as standard feature of our CMSes?

Yes. Basic features by 2014. CMSes are usually two years behind.

For the sake of expediency, most of these answers are really short. If you want elaboration on anything, let me know in the comments.


O’Reilly webcast: Mobile web performance trends and predictions [SLIDES]

Earlier this week, I had the privilege of participating in an O’Reilly webcast as a preview for my session at Velocity EU next week, where I’ll be presenting the results of Strangeloop’s first annual state of the union on mobile ecommerce performance (not to be confused with our quarterly SoTU on desktop performance, which was most recently released last week).

If you missed the webcast on Tuesday, here are the slides:

We covered a lot of ground. Here’s an overview:

  • 3-5: The mobile market
  • 6-12: Case studies: Why is faster better?
  • 13-23: Measuring mobile performance: tools and tips
  • 24-29: Visualizing performance
  • 30: How does data work on a mobile network?
  • 31-91: Best practices in action
  • 92-102: Mobile caching (it’s a good thing)
  • 103-111: Evolution: Where mobile is heading
  • 112-115: Sneak peek: 2012 State of Mobile Ecommerce Performance

As you can see, I saved the preview for near the end. But I saved the best slide for last… my costume for our “Celebrity Dress-Alike Day” at Strangeloop on Tuesday:

If you have any questions about these slides, drop me a note in the comments. And if you’re going to Velocity next week, I hope to see you there!


New findings: Ecommerce sites are 9% slower than in 2011

Two years ago, when Strangeloop started tracking the load times of 2,000 top North American ecommerce sites, we had a hunch we’d spot some interesting trends over time. We did not expect, however, to see that pages are continuing to get slower rather than faster. Yet according to the Fall 2012 release of our quarterly Ecommerce Page Speed and Web Performance State of the Union, which came out today, that’s exactly what’s happening.

Not only are pages slower, they’re dramatically slower. Since November 2011, when we last tested these sites, the median home page has taken a 9% performance hit, with load time increasing from 5.94 seconds to 6.5 seconds. This flies in the face of conventional belief that, thanks to faster browsers, networks, and devices, the average end user is enjoying a premium online experience. This is clearly not the case. As user expectations continue to grow, the gap between expectations and reality continues to widen.

Ecommerce Page Speed and Web Performance State of the Union [Fall 2012]

These graphics (higher res version here) illustrate a few of our findings. I encourage you to download the report to read the rest. Without giving it all away, here are a few highlights:

  • Internet Explorer 10 served pages faster than other browsers, most notably 8% faster than Chrome 20. We tested each page across a number of browsers, including the latest versions of IE, Chrome, and Firefox. It’s important to bear in mind that these were simple tests that didn’t take into account the many nuances of browser performance (which is discussed further in the report), but we considered these results interesting enough to share.
  • Top sites are 10% slower than the pack as a whole. While the median site took 6.5 seconds to load, we saw even poorer results when we looked at the top 100 sites (ranked by revenue and profitability), with the median Alexa 100 home page having a load time of 7.14 seconds. Check out the report for our thoughts on this.
  • Many sites are still not following core performance best practices. We found that 30% of sites tested did not use compression, and 12% did not use keep-alives. As I’ve talked about elsewhere, these two fairly simple techniques can yield big results, including up to 52% improvement in start render time.
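Checking a response for these two practices is straightforward once you have its headers. The sketch below is a simplified check, not the method used in the report. The function name is hypothetical, and it assumes HTTP/1.1, where connections are kept alive unless the server sends `Connection: close`.

```python
def missing_basics(headers):
    """Return a list of core practices a response appears to be missing."""
    normalized = {name.lower(): value.lower() for name, value in headers.items()}
    problems = []

    # Compressed responses announce it in Content-Encoding.
    encoding = normalized.get("content-encoding", "")
    if "gzip" not in encoding and "deflate" not in encoding:
        problems.append("no compression")

    # Under HTTP/1.1 (an assumption here), only an explicit "close"
    # means the connection will not be reused.
    if normalized.get("connection", "") == "close":
        problems.append("no keep-alive")

    return problems

# Hypothetical response headers for illustration.
print(missing_basics({"Content-Type": "text/html", "Connection": "close"}))
print(missing_basics({"Content-Encoding": "gzip", "Connection": "keep-alive"}))
```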

Why you should care about these findings

To my knowledge, Strangeloop’s state of the union reports (which we’re now releasing on a quarterly basis) are the only ongoing surveys that measure performance from the perspective of real users. By using WebPagetest, we can simulate performance across browsers and realistic latencies, and get a real-world look at how websites actually behave. It’s easy for site owners to fall into the trap of thinking that their sites are fast for everyone, because site owners are typically seeing benchmark tests run out of datacenters.

I want to emphasize that reports like this one are not a substitute for the real user monitoring you should be performing on your site on an ongoing basis. Instead, consider it a snapshot that we can collectively hold up as a mirror of big-picture ecommerce performance.

As always, I welcome your feedback and questions.

Download the report: State of the Union: Ecommerce Page Speed and Website Performance [Fall 2012]

Download a high-res version of the infographics above (and feel free to re-post): Poster: Ecommerce Page Speed and Website Performance [Fall 2012]
