mobile web performance

Review: Blaze Mobile performance measurement tool

Our industry has been in desperate need of a solid tool that measures real-world mobile website performance. Blaze Mobile, which is based on the Webpagetest framework, may be just what we’ve been waiting for.

The tool is still in beta, and the folks at Blaze have a list of known issues they’re working on. It’s up to the rest of us to take it for a test drive, see how it performs, and offer our feedback.

I thought it would be interesting to start by testing the current top 5 sites in Keynote’s mobile commerce index and benchmark Blaze’s results alongside Keynote’s.

Test parameters

Blaze lets you test for iPhone and Android, so I tested on both those platforms.

You can choose between one, two, and three test runs per URL. I opted for three runs. The averages are below.

The tool also lets you capture video results, so I tested with and without that option. As discussed here, capturing video can have a negative impact on Webpagetest’s page load times. Blaze has stated that their tool has the same issue, so I wanted to see how pronounced the impact of video capture is.

Comparison: Keynote and Android

| Website | Keynote | Blaze: Android (with video) | Blaze: Android (no video) |
|---|---|---|---|
| Strand.com | 2.98s | 2.3s | 2.14s |
| Barnesandnoble.com | 4.53s | 1.06s | 1.12s |
| Walmart.com | 4.41s | 2.27s | 2.82s |
| Dell.com | 4.35s | 1.66s | 1.5s |
| Victoriassecret.com | 5.94s | 2.4s | 3.45s |

Comparison: Keynote and iPhone

| Website | Keynote | Blaze: iPhone (with video) | Blaze: iPhone (no video) |
|---|---|---|---|
| Strand.com | 2.98s | 3.83s | 2.67s |
| Barnesandnoble.com | 4.53s | 3.43s | 1.73s |
| Walmart.com | 4.41s | 6.29s | 3.73s |
| Dell.com | 4.35s | 3.72s | 4.63s |
| Victoriassecret.com | 5.94s | 5.5s | 5.05s |

What patterns emerged?

Even allowing for fluctuations caused at the network and delivery end of things, there are some interesting patterns:

  • Across the board, Android load times were significantly faster than Keynote and iPhone load times. In the case of the Barnes & Noble site, the Android load time was roughly four times as fast as Keynote’s (1.06s versus 4.53s).
  • Keynote’s load times surprised me by being, overall, slower than Blaze’s. I had expected that, because Keynote takes their measurements at the highest possible network speeds available at the time of the tests, their results would be faster.
  • Video capture dramatically slowed down iPhone results, in some cases making them almost twice as slow as load times tested without video capture. The effect of video capture on Android results was much slighter.
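
To put numbers on these patterns, here is a minimal Python sketch that recomputes the ratios from the two tables above. The load times are copied from the tables; the dictionary layout and the function names are assumptions made for illustration.

```python
# Average load times in seconds, copied from the comparison tables above.
# Tuple order: (Keynote, Android with video, Android no video,
#               iPhone with video, iPhone no video)
RESULTS = {
    "Strand.com": (2.98, 2.30, 2.14, 3.83, 2.67),
    "Barnesandnoble.com": (4.53, 1.06, 1.12, 3.43, 1.73),
    "Walmart.com": (4.41, 2.27, 2.82, 6.29, 3.73),
    "Dell.com": (4.35, 1.66, 1.50, 3.72, 4.63),
    "Victoriassecret.com": (5.94, 2.40, 3.45, 5.50, 5.05),
}


def speedup(slower, faster):
    """How many times faster the second load time is than the first."""
    return slower / faster


def video_overhead(with_video, no_video):
    """Ratio of load time with video capture to load time without it."""
    return with_video / no_video


for site, (keynote, a_vid, a_plain, i_vid, i_plain) in RESULTS.items():
    print(
        f"{site:<20} "
        f"Keynote vs Android: {speedup(keynote, a_plain):.2f}x  "
        f"video overhead, Android: {video_overhead(a_vid, a_plain):.2f}x  "
        f"iPhone: {video_overhead(i_vid, i_plain):.2f}x"
    )
```

A video overhead above 1.0 means the run with video capture was slower; a value below 1.0 means it was faster. For Barnes & Noble on iPhone the overhead works out to about 1.98x, which is the "almost twice as slow" case.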

My review

Usability

I found Blaze Mobile extremely usable. The UI is really clear, and considering the fact that it can only run one test at a time, I was impressed with how fast it was able to deliver results (though that could change if it catches on).

Results page

The test results pages are easy to read and include a waterfall chart and a HAR file. Right now, the only summary metrics shown on the results page are load time and page size.

Suggestions for future development:

  • It would be great to also see time to first byte and start render. Sure, you can find this on the waterfall, but waterfall interpretation isn’t in everyone’s skill set. (If you want to add it to yours, here’s a beginner’s guide to waterfall charts.)
  • I also like Webpagetest’s option to view your test results as a filmstrip. This would be a good feature to see here as well.
  • It would be good to be able to run side-by-side tests.
  • I’d also like to be able to export the video file.
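
Until time to first byte appears on the results page, it can be approximated from the HAR file. The sketch below assumes the export follows the HAR 1.2 layout (`log.entries[].timings`, `log.pages[].pageTimings`) and that the first entry is the base page request. The function names and the sample data are hypothetical. Start render is not recorded in a standard HAR file, so it is not covered here.

```python
def first_request_ttfb_ms(har):
    """Approximate time to first byte (ms) for the first request in a HAR log.

    Sums the HAR 1.2 timing phases that precede the first response byte.
    A value of -1 (or a missing key) means the phase did not apply.
    """
    # Assumption: entries are in request order, so entries[0] is the base page.
    timings = har["log"]["entries"][0]["timings"]
    total = 0.0
    for phase in ("blocked", "dns", "connect", "send", "wait"):
        value = timings.get(phase)
        if value is not None and value >= 0:
            total += value
    return total


def page_load_ms(har):
    """Return the onLoad time (ms) of the first page, or None if not recorded."""
    pages = har["log"].get("pages") or []
    if not pages:
        return None
    on_load = pages[0].get("pageTimings", {}).get("onLoad")
    if on_load is None or on_load < 0:
        return None
    return on_load


# Hypothetical, hand-written HAR fragment for illustration only.
sample_har = {
    "log": {
        "pages": [{"pageTimings": {"onContentLoad": 900, "onLoad": 1500}}],
        "entries": [
            {
                "timings": {
                    "blocked": -1,  # not applicable for this request
                    "dns": 20,
                    "connect": 30,
                    "send": 1,
                    "wait": 150,
                    "receive": 40,
                }
            }
        ],
    }
}

print(first_request_ttfb_ms(sample_har))  # 20 + 30 + 1 + 150 = 201.0
print(page_load_ms(sample_har))           # 1500
```

For a real test, load the downloaded file with `json.load` and pass the resulting dictionary to both functions.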

Methodology

I like Blaze’s transparent methodology. Keynote provides a bit of this in a footnote on their mobile index page, but I’ve always hankered for more background from them. I’d love to know what devices and operating systems they’re testing with. With Blaze, I get a better sense, not just of how they run their tests, but also the inevitable caveats that come with interpreting the results. As a result, I feel like I can buy in to their results. They feel truthy.

My only major caveat

Until the video capture issue is resolved for the iPhone, I’d consider these results invalid. So make sure you de-select ‘Enable Video Capture’ when you run iPhone tests.

Other than that issue, I’m adding Blaze Mobile to my toolset, and looking forward to where they take it next.

Is web performance optimization a “green” issue?

I live in a city with a bold ambition: to be the greenest city in the world by 2020. I try to do my part. I take the bus or walk to work, my family has only one car, which we rarely use, and we even use the ugly energy-efficient Christmas lights.

When it comes to my work, I also believe that I have a positive impact, and it makes me feel good when green is highlighted in our industry.

Last May, Steve Souders came up with a list of predictions about the future of web performance optimization. Prediction #3 was this:

“Finally we’ll see studies conducted that quantify how improving web performance reduces power consumption and ultimately shrinks the web’s carbon footprint.”

When I first read this, my immediate reaction was, “Wow. That’s a really cool, bold expectation.” Ever since, I’ve kept my eyes open for anything I can find about the impact of performance on energy use. Maybe I’m hanging out in the wrong part of the internet, but I haven’t had much luck.

Analyzing the trade-off between performance optimization and energy use is a huge challenge. It’s not enough to say, “We’re delivering smaller pages and fewer/smaller objects, therefore using less energy. Problem solved!” We also have to take into consideration:

  • the energy consumed by new machines added to the network to automatically transform web pages,
  • the impact on servers as more is offloaded in the network,
  • the change in use of a content delivery network,
  • the change in energy consumption at the client level based on increased or decreased CPU use,
  • the change in energy consumption of the user as they browse more pages and buy more,
  • etc., etc.

So while it would be great to see a simple “If _____, then _____” equation for calculating performance/energy savings, perhaps it’s not a huge surprise that not a lot of people have tackled this big hairy question.

A (very) few people have tried to tackle this question. These are the best articles and blog posts I’ve come across:

Steve Souders: How green is your web page?

This blog post is almost three years old, but still worth reading. Huge kudos to Steve for this exercise in quantifying how specific performance improvements (he uses Wikipedia as an example) could lead to energy savings. When I imagine helpful equations for calculating performance/energy benefits, I imagine them looking a lot like this.

Boston.com: Taking a different measure

Really interesting article that came out last fall about how Akamai is auditing the carbon footprint of its 70,000-server network. It’s a hugely ambitious project, which the company undertook after realizing that 87% of its carbon footprint came from its network operations. This is the kind of data we need. It’s a key piece of the puzzle in figuring out how to quantify performance and energy use.

Fast Company: Is the Internet Sustainable When Everyone On Earth Uses Over 3 Gigabytes of Data Per Day?

Scary quote alert…

“That’ll come to 2,570 exabytes per year for the global population, by 2030. (An exabyte is a billion gigabytes.) The average power needed to sustain such activity would be 1,175 gigawatts. It takes an entire large coal-fired power plant to produce just one gigawatt of energy, so imagine 1,175 of those churning out power just to fuel the world’s data hunger.”

That piece came out right before Christmas, and the fact that it appeared in a relatively mainstream publication like Fast Company is an indicator of the fact that these questions are not going to go away.
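
As a back-of-the-envelope check on the scale of that quote, here is a short Python sketch. The only inputs are the two figures quoted above (2,570 exabytes per year and 1,175 gigawatts of average power); the energy-per-gigabyte figure is derived here for illustration and does not come from the article.

```python
# Figures taken from the Fast Company quote above.
EXABYTES_PER_YEAR = 2570
AVERAGE_POWER_GW = 1175

# Conversion factors.
HOURS_PER_YEAR = 24 * 365   # 8,760 hours, ignoring leap years
GB_PER_EXABYTE = 1e9        # "An exabyte is a billion gigabytes."
KWH_PER_GWH = 1e6

# Average power sustained for a year gives annual energy.
energy_gwh = AVERAGE_POWER_GW * HOURS_PER_YEAR
energy_twh = energy_gwh / 1000

# Implied energy cost of moving one gigabyte.
kwh_per_gb = (energy_gwh * KWH_PER_GWH) / (EXABYTES_PER_YEAR * GB_PER_EXABYTE)

print(f"Annual energy: {energy_twh:,.0f} TWh")
print(f"Implied energy per gigabyte: {kwh_per_gb:.2f} kWh")
```

Taken at face value, the quoted figures imply roughly 10,293 TWh per year, or about 4 kWh for every gigabyte transferred.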

According to this report, video is a major bandwidth hog. Streaming/downloaded content from Netflix, YouTube, BitTorrent, and iTunes accounts for 40% of peak U.S. web traffic. (It may be a sad statement that, when I learned that YouTube users are uploading 35 hours of video per minute, my reaction was, “That’s all?”) Video also dominates mobile in pretty much the same proportion.

And that’s just the activity in the United States. Internet users in China log a total of one billion hours online every day, twice as much as Americans. Adoption rates are expected to more than double in the next three years, and not just in China. India, Brazil, Russia, and Indonesia are also poised to see a huge growth in the amount of time their citizens spend online.

So that’s it, the sum total of information I’ve found.

I’m an optimist. I believe that most problems have solutions. My hunch tells me that Steve is correct in postulating “Make your pages faster. It’s good for your users, good for you, and good for Mother Earth.” However, I don’t think we have enough data to confirm or deny what seems obvious. We need more data, and I for one am trying to work with customers to put together case studies that demonstrate positive environmental impact, as much for myself, so I can sleep at night, as for our industry as a whole.

Automating complexity: The future of website performance optimization

The bad news: Performance optimization is going to get harder.

The good news: It doesn’t have to be.

Let me break it down.

Five years ago, when we started Strangeloop, being able to automatically apply the Yahoo and Google performance best practices universally across a website was a monumental achievement. Little did we realize this was just the beginning of a learning curve that has grown infinitely more nuanced.

Browsers began to innovate, which created a new dimension to the problem. Each browser required its own interpretation of the performance rules, with variations existing even between different versions of the same browser. (Domain sharding remains a good example of this, in that it can hurt performance in modern browsers and help performance in older browsers.)

Then last year we went through a process where our customers — having internalized the idea that every second matters — drove us to optimize for user flows and experience, with a hawkish eye on business metrics like revenues and conversions. It became clear that optimizing isolated pages, out of the context of their place in a user’s flow through your site, does not automatically improve performance.

It also became clear that we can’t treat landing pages the same as we treat non-landing pages. As a result, we found ourselves once again changing how we applied performance best practices. (In December, I wrote a case study about this for Stoyan Stefanov’s performance calendar.)

Now we are seeing the next step of performance evolution.

Our collective fixation on the mantra “every second counts” — combined with trying to anticipate user behaviour — is being taken to the next level.

We are moving from the “simple” world of universally applying 15-20 performance treatments across a site to a world in which these 15-20 techniques are applied uniquely to each page of a site according to the following parameters:

  • The user’s past interaction with the site
  • The behaviour patterns of other previous users
  • The specific requirements of each browser type and version

To say that this is an exponential increase in complexity may be an understatement.
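
A rough way to see the growth, as a Python sketch. It assumes each treatment is simply switched on or off, which is a simplification; the browser and page-type counts in the example are hypothetical placeholders, not figures from our product.

```python
def on_off_combinations(treatments):
    """Number of distinct on/off settings for a set of independent treatments."""
    return 2 ** treatments


def total_configurations(treatments, browser_variants, page_types):
    """Combinations to consider if each browser variant and page type
    may need its own on/off setting (a simplifying assumption)."""
    return on_off_combinations(treatments) * browser_variants * page_types


for n in (15, 20):
    print(f"{n} treatments -> {on_off_combinations(n):,} on/off combinations")

# Hypothetical example: 15 treatments, 10 browser variants, 2 page types
# (landing and non-landing pages).
print(f"{total_configurations(15, 10, 2):,} configurations")
```

Fifteen treatments already give 32,768 on/off combinations, and twenty give 1,048,576, before any per-user behaviour is taken into account.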

Thinking about the evolution of this problem reminds me of one of my favourite childhood books, now long out of print, called The Magic Well by Piero Ventura. In it, a small town becomes overwhelmed by the complexity created by a magic well that produces yellow balls. When I think about the complexity of the web performance problem for app developers and IT departments, these images keep coming up.

[Image: Automating web page performance: Exponential complexity]

I see the early days, when the problem was small and manageable, and today, when the problem has become so complex that it is overwhelming.

Are we tackling an impossible problem?

Trying to divine user behaviour in a very complex world is not unique to our industry. I think it’s instructive to look at the evolution of the internet marketing industry. In the old days, you would come up with a good campaign and execute. Then the internet provided marketers with a cheap, easy platform to start testing campaigns, and A/B testing became the norm. This is essentially the stage the performance industry is at right now, as we focus on accelerated versus unaccelerated page results.

As marketing matured, some marketers realized that A/B tests were not good enough and moved to a better solution: multivariate testing (also known as MVT). With MVT, marketers identified a fixed set of variables, created a large number of combinations of those variables, and then tracked users’ preferences to determine which combination was most effective.
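
The mechanics of building those combinations can be sketched in a few lines of Python. The variable names and values below are hypothetical examples, not taken from any real campaign.

```python
from itertools import product

# Hypothetical page variables a marketer might test.
variables = {
    "headline": ["A", "B", "C"],
    "button_colour": ["green", "orange"],
    "hero_image": ["product", "lifestyle"],
}


def build_variants(variables):
    """Return every combination of the given variables as a list of dicts."""
    names = list(variables)
    return [
        dict(zip(names, combo))
        for combo in product(*(variables[name] for name in names))
    ]


variants = build_variants(variables)
print(len(variants))   # 3 * 2 * 2 = 12
print(variants[0])
```

Each variant is then shown to a share of visitors, and the results are compared to find the most effective combination.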

The advantage of multivariate testing over simple A/B testing is that it helps website owners take a much more granular approach to figuring out what works and doesn’t work, allowing them to fine-tune a page and squeeze every last drop of value from it. (If you want more background on this fascinating aspect of marketing, Elastic Path has an excellent post about A/B and multivariate testing on their blog.)

Multivariate testing becomes critical once a problem is so complex that our intuition is no longer reliable. We may think we can anticipate how users will interact with a page, but we can’t trust our gut feelings. For a humbling reminder of this, check out Anne Holland’s blog, Which Test Won. Each week, Anne shows you two landing pages and asks you to guess which one won the A/B test. I like to think I’m pretty marketing savvy, and my success rate on Anne’s blog is only about 30%.

So what does this have to do with website optimization?

I feel that, like marketers, we are at a place in the web page optimization world where we are unable to intuitively make decisions about performance optimizations without applying various combinations of rules for each browser and then measuring real-world results. Marketers use multivariate testing to find the precise combination that works best. We should do the same for performance, to know what combination of performance best practices applied to all aspects of a site (pages, sessions, workflows, browsers, caches, etc.) works best.

Obviously, if you apply the generic set of best practices across your site, you’ll get a significant advantage, but this advantage will not be enough in the long run. We know that revenue, conversions, page views and customer satisfaction are not just affected by each second you shave from page load times — they’re affected by every millisecond. In an increasingly competitive online world, the winners are going to be those companies that work to shave off every last millisecond.

The counterargument is this: Why do you need to try hundreds of combinations? Isn’t the best one the one that makes your site the fastest? Marketers don’t have a metric to measure by, other than how well people react in the real world (i.e. there’s no Webpagetest for marketing). Performance is measurable — there are lots of tools. Why not just use the combination that makes your site fastest?

I would answer this by saying that my experience over the last year has suggested that, like marketers, we cannot figure out the best combination in the lab because we cannot predict real user behaviour. I am continually surprised at how bad we here at Strangeloop are at predicting outcomes when we turn various Site Optimizer features on and off.

Admittedly, 20% of the effort gets you 80% there, but you need 80% effort to hit the last 20% — and it’s the last 20% that actually matters. That’s the key part. The easy stuff ultimately doesn’t matter. Using a marketing analogy, it’s like putting the menu bar at the bottom of the page, below the fold. Of course you don’t do that. You, like most people, know that the menu belongs above the fold. But it’s not enough to plunk your navigation at the top of the page. The nuances of its layout and design are what will ultimately determine its success, and these nuances are the hardest thing to figure out.

So how do we take care of this last 20% without losing our minds or blowing the bank?

Multivariate testing works on the internet because the platform provides a cheap way to automate it, and it is possible to incorporate feedback in real time and adjust accordingly.

In a similar way, I believe that transformation-based performance solutions should be able to take page performance and user-based behaviour metrics into account, then perform our own version of ongoing multivariate testing. I also believe that we should be able to perform these tasks quickly and cost-effectively.
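
As a rough illustration of what that could look like, the Python sketch below picks, for each browser, the treatment combination with the best observed conversion rate. Everything in it is hypothetical: the function name, the browser labels, the treatment names, and the numbers. It also skips statistical significance testing, which any real system would need.

```python
from collections import defaultdict


def best_combination_per_browser(observations):
    """Pick the highest-converting treatment combination for each browser.

    observations: iterable of (browser, combination, visits, conversions),
    where combination is a tuple of treatment names.
    Returns {browser: (combination, conversion_rate)}.
    """
    # Aggregate visits and conversions per (browser, combination).
    totals = defaultdict(lambda: [0, 0])
    for browser, combination, visits, conversions in observations:
        totals[(browser, combination)][0] += visits
        totals[(browser, combination)][1] += conversions

    best = {}
    for (browser, combination), (visits, conversions) in totals.items():
        if visits == 0:
            continue  # no traffic yet, nothing to compare
        rate = conversions / visits
        if browser not in best or rate > best[browser][1]:
            best[browser] = (combination, rate)
    return best


# Hypothetical measurements for illustration only.
observations = [
    ("older_browser", ("sharding", "inlining"), 1000, 30),
    ("older_browser", ("inlining",), 1000, 25),
    ("modern_browser", ("sharding", "inlining"), 800, 16),
    ("modern_browser", ("inlining",), 800, 24),
    ("modern_browser", ("inlining",), 200, 6),
]

for browser, (combination, rate) in best_combination_per_browser(observations).items():
    print(f"{browser}: {' + '.join(combination)} ({rate:.1%})")
```

The choice to rank by conversion rate rather than load time is deliberate here: the point is to let real user behaviour, not lab speed alone, decide which combination wins.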

I am working to move my company to a place where we can test hundreds of different acceleration combinations on different browsers on any given page, flow, and website — all in an automated way. I see this as the future of our industry.

2011 Web performance predictions for the mobile industry

I recently contributed an article to RCR Wireless that highlights my four big predictions for the mobile industry:

  1. Companies will generate at least 15% of Web sales via their social presence and mobile applications.
  2. Android will become the No. 1 mobile platform, surpassing the iPhone in terms of units and usage.
  3. Retailers will realize that mobile shoppers have a goal-driven “hunter” mentality.
  4. As a result of No. 3, mobile Web performance will become as important as desktop Web performance.

Read the full article here.

Why mobile websites are still disappointing consumers

Last week, I wrote a guest post for Marketing Daily on the fact that, no matter how much press is being given to the success of mobile shopping this holiday season, mobile sites are still not meeting consumers’ expectations. An excerpt:

Fifty-eight percent of mobile users expect sites to load at least as quickly on their mobile devices as on their desktops. You would think this expectation would lead to increasingly faster m-commerce sites, but the opposite seems to be the case.

According to industry benchmarks from Gomez and Keynote, mobile sites seem to be getting slower, not faster. Currently, the average m-commerce site loads in 5.47 seconds. A year ago, that number was 4.73 seconds.

There are a few potential culprits here, such as oversized graphics and poorly optimized widgets, but pointing fingers doesn’t fix the problem. Site visitors want a fast online experience, not excuses.

The rest of the post is here: All I Want For Christmas Is A Faster Mobile Experience
