While sitting at one of those executive-takes-out-prospective-client-for-a-very-expensive-steak-dinner gigs during Velocity this year, I was struck by the observation of my dinner companion. It was his first time at Velocity and he was surprised and disappointed that CDNs were not a bigger part of the discourse.
“How can ‘use a CDN’ be one of the top three performance optimization techniques recommended by all of the tools, and yet have no visibility at Velocity? There were no sessions about CDNs, and none of the big vendors showed up. This conference is missing a key component.”
I have been thinking about this since Velocity, and I agree: our community should try to wrap our arms around the CDNs and involve them in the debate.
Why should anyone pay a premium for dynamic site acceleration?
Most of the big-name CDNs offer their own dynamic site acceleration (aka “whole site acceleration”) products: Limelight Site, Cotendo, Akamai DSA and WA, and CDNetworks Application Acceleration. I keep getting questions about how these products help and why companies should pay a premium for these services.
I have tried in vain to find a simple description of how these technologies work and the benefits of each. Here is my best attempt.
First, let’s define the space.
In my opinion, the key defining element of dynamic site delivery is the idea that the request for the page AND all of the page objects go to the CDN. This is very different from small object CDN delivery, in which the request for the page is sent directly to your datacenter (what the CDNs call “the origin”) and the browser requests objects (CSS, JS, SWF, images) from the CDN.
After this defining element, which is shared by all products in this category, the world gets more complicated and confusing.
In my simple world, the market now splits into two major categories:
- Those that can cache HTML at the edge.
- Those that can cache HTML at the edge AND provide some level of acceleration for delivering the HTML from the origin to the edge, in case it can’t be cached at the edge.
Products that can cache HTML at the edge
Both categories can cache HTML at the edge. This means that when you make a request for a static page, the closest POP (point of presence) will respond with the HTML, without having to go back to the origin server.
If we look at this from a waterfall perspective, HTML caching will make the orange bar smaller (initial connection time) and the green bar smaller (time to first byte), and perhaps help the blue bar (content download), though it probably won’t do much for the HTML download itself unless you were doing no compression before using the service. I’ve also seen the DNS lookup get better when these services are used, but that’s not because of HTML caching: it’s because these services have a pretty robust DNS infrastructure that ultimately helps DNS lookup times.
All this is assuming that your user is close to a POP and your HTML content is in the CDN cache (both of which are assumptions you can’t always make, especially for long tail content).
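You can see how much of those first two bars there is to win by timing them yourself. Here is a minimal Python sketch (standard library only; the host name is a placeholder) that separates the DNS lookup from the TCP connect, the two phases edge proximity attacks directly:

```python
import socket
import time

def connection_phases(host: str, port: int = 443) -> dict:
    """Time the DNS lookup and the TCP connect separately, in milliseconds."""
    t0 = time.perf_counter()
    # DNS lookup (the teal bar in the waterfall)
    addr = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)[0][4]
    t1 = time.perf_counter()
    # TCP three-way handshake (the orange bar)
    with socket.create_connection(addr[:2], timeout=5):
        t2 = time.perf_counter()
    return {"dns_ms": (t1 - t0) * 1000, "connect_ms": (t2 - t1) * 1000}

# Example (hypothetical host): connection_phases("www.example.com")
```

Run it from a few geographies and you get a rough feel for what a nearby POP would shave off before any HTML caching even enters the picture.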
It is important to pause here and note the following:
- Most of the customers I interact with have very dynamic pages and their content cannot be cached at the edge, notwithstanding the amazing marketing done by CDNs around archaic technologies such as edge side includes (ESI).
- As I have mentioned in the past, most customers have a front-end problem and this feature only addresses the back-end problem. However, every second counts, so if you can take advantage of dynamic/whole site acceleration, then you should investigate it.
- The benefit of static HTML caching grows with latency, packet loss, and jitter. Your users in, say, rural China will get more of a benefit from this than your users in the US, if your servers are in the US.
Products that can cache HTML at the edge AND provide some level of acceleration for delivering the HTML from the origin to the edge
Now imagine a world in which the HTML must come from the origin servers, i.e., your pages are dynamic and can’t be cached at the edge. In that case, the performance benefit of sending your traffic to a location very close to the user and then over the CDN’s network to the origin server is threefold:
- The CDN can dramatically reduce the DNS lookup time (remember, you can also get this from a dedicated DNS service like Ultra DNS).
- The CDN can keep connections open between its POPs and save on the initial connection time.
- The CDN can make tweaks at the network layer (TCP/IP) to ensure the highest rate of data transfer and, in the case of packet loss, ensure efficient recovery. This will result in faster content download time for the HTML.
In other words, the CDN can build a really big robust highway between two locations anywhere in the world and send data quickly between POPs. If we look at this from a waterfall perspective, when comparing this to going straight to the origin for the HTML, we’ll see improvements in the teal bar (DNS lookup) and orange bar (initial connection time), and we should see improvements in the blue bar (content download), since the service promises to deliver HTML over their network more quickly.
As I mentioned above, the benefits of dynamic site acceleration are most visible in a waterfall taken from a location with high latency and other network-related problems. For example, here is a page tested from San Jose versus a page tested from China.
Obviously, content download is going to be worse in Asia, but the TCP/IP tweaks that dynamic site acceleration can provide will help much more in the second waterfall than in the first.
Other features that don’t have much to do with acceleration but help sell the product
Bundled into dynamic site acceleration are a number of other ostensible value-added services:
- Offload: Offloading all of your traffic to a CDN means you buy fewer boxes and have less hassle. Make sure you understand the cost-benefit here, as this is very expensive hosting.
- Small object caching: This is always bundled into the DSA pricing. Make sure you know how much the DSA part is going to help vs. the small object part. On its own, small object delivery can be up to 10 times cheaper.
- HTML Compression: Obviously this helps with performance, but you can already get this for free on your servers. If you run a big shop, you should already have an ADC with this capability.
- Availability features: The ability to keep showing a pulse (e.g., serve a cached or fallback page) when the route to your origin is blocked or your servers are down.
- Security: Features like DDoS protection and a web application firewall (WAF) are sexy, but make sure they don’t duplicate what you already have.
- PCI compliance: Ensure you understand what this means and if you need it.
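The compression point, at least, is easy to verify yourself. Here is a small Python sketch (standard library only; the URL is a placeholder) that requests a page while advertising gzip support and reports whether the HTML came back compressed:

```python
import urllib.request

COMPRESSED = {"gzip", "br", "deflate", "zstd"}

def is_compressed(headers) -> bool:
    """True if the response headers declare a compressed Content-Encoding."""
    return headers.get("Content-Encoding", "").lower() in COMPRESSED

def check_compression(url: str) -> bool:
    """Fetch a URL while advertising gzip support; report whether it came back compressed."""
    req = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return is_compressed(resp.headers)

# Example (placeholder URL): check_compression("https://www.example.com/")
```

If this returns False for your own pages, turn compression on at your origin before paying a CDN to do it for you.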
So what to do with this information?
Understand why you need whole site delivery. Buying it for security, scalability and offload is a very different decision than buying it for acceleration. It is very common to lump the benefits of small object delivery and dynamic site delivery into one category. You need to separate them, because dynamic site delivery can be up to 10 times more expensive.
- Identify why you might be interested in whole site delivery. If acceleration is an important factor, proceed to the next tasks.
- Determine if you can cache your HTML for any meaningful period of time (3+ hours). If you can, try caching it yourself first, then compare that against the benefit of a dynamic CDN.
- Get waterfalls from different locations using Webpagetest, look at the HTML bar (usually the first bar), and see if reducing the DNS lookup time (teal bar: assume 50%), the initial connection time (orange: assume 60%), the server think time (green: assume 80%, but only if you can cache your HTML), and the download time (blue: assume 5% in North America and Europe, 50% in other parts of the world) actually brings you a benefit. (I’m making a broad assumption that you already have compression turned on. If you don’t, you should.)
- Research the different companies, or email me. (I know them all and would be happy to help.)
- Pick a few to try. Make sure you try both small object and dynamic site delivery.
- Test each vendor yourself using an open-source tool like Webpagetest. Be wary of the tests the vendors send you, as they have often gamed the system by putting test machines next to the edge caches and/or keeping things in cache much longer than they would in the real world.
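Two of these checks can be scripted. The first sketch parses a page’s Cache-Control header to see whether a shared cache (i.e., the CDN edge) is even allowed to hold your HTML for a meaningful TTL; the second applies the rough percentage reductions from the waterfall step to measured phase times. Both are sketches, and the percentages are the same broad assumptions stated above, not measured data:

```python
import re

def edge_ttl_seconds(cache_control: str) -> int:
    """Seconds a shared cache may hold the response, per Cache-Control (0 = not edge-cacheable)."""
    cc = cache_control.lower()
    if any(d in cc for d in ("no-store", "no-cache", "private")):
        return 0
    # s-maxage targets shared caches and wins over max-age when both are present
    m = re.search(r"s-maxage=(\d+)", cc) or re.search(r"max-age=(\d+)", cc)
    return int(m.group(1)) if m else 0

def estimated_savings_ms(dns_ms, connect_ms, ttfb_ms, download_ms,
                         html_cacheable=False, far_from_origin=False):
    """Apply the rough assumptions above to measured waterfall phases (all in ms)."""
    saved = 0.50 * dns_ms          # DNS lookup: assume 50% reduction
    saved += 0.60 * connect_ms     # initial connection: assume 60% reduction
    if html_cacheable:
        saved += 0.80 * ttfb_ms    # server think time: assume 80%, only if HTML is cacheable
    saved += (0.50 if far_from_origin else 0.05) * download_ms
    return saved

# A page cacheable for 3+ hours (10800 s) is a candidate for edge caching:
# edge_ttl_seconds("public, max-age=10800") -> 10800
```

If the estimated savings come out to a few tens of milliseconds for your mostly-US-and-Europe audience, that number is what you are weighing against the 10x price premium.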
I’ll be honest. I remain a skeptic. As an acceleration play, I see the cost of dynamic site acceleration far outweighing the benefit. That’s a lot of money to pay for the small incremental performance gain you get on the HTML for most sites, especially if you’re only in North America and Europe. There are some benefits of DSA, though, in areas of high latency, and for fixing general network problems.
I do see a lot of potential for this sort of technology. I think it has some promise, but there’s been very little innovation in the last 3+ years. Nothing exciting has come out since Netli.
I’d like to be more convinced, so if anyone has real-world performance data with compelling evidence that the performance gain is significant enough to be worth the price, I’m all ears.