Speed up page loads

Posted by ezmobius Fri, 29 Dec 2006 23:51:00 GMT

Here is a little tip that can drastically speed up your page loads for pages that have a lot of static assets, especially pages with tons of little images.

Most browsers will only open up 2 concurrent connections per hostname. This means that if all of your assets are being served from http://example.com and you have a lot of little images and scripts on the page, the client's browser will open only two connections to example.com and pipeline, or queue, all of the asset downloads over those two connections.

By using a wildcard subdomain, or by manually setting up DNS so that the static assets are spread out over a few different subdomains, you let the browser open two connections per subdomain, so all the assets download in a more parallel fashion. This can be the difference between page loads that lag at the end while all the little assets trickle in, and a page that snaps into place and feels a lot quicker to the user.
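The usual way to wire this up is a wildcard DNS record, so that every assetN subdomain resolves to the same web server with no per-subdomain configuration. A minimal sketch, assuming a BIND-style zone file and a documentation placeholder IP:

; hypothetical zone entry for example.com: asset1 through asset4
; (and any other subdomain) all resolve to the same server
*.example.com.   3600   IN   A   192.0.2.10

If a wildcard record isn't an option, four explicit A records for asset1 through asset4 accomplish the same thing.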

So if we use a simple little view helper method for all of our image URLs, we can spread the load out by fooling the browser into thinking it is connecting to multiple servers. For example, if we serve all our images from these subdomains:

asset1.example.com
asset2.example.com
asset3.example.com
asset4.example.com

This will give us 8 concurrent connections from the browser to the server for static assets, which dramatically decreases page load time. The thing to watch out for is that you always want to serve the same asset from the same subdomain, or else you defeat browser caching and won't gain anything from this trick. So we will use a Zlib CRC32 hash of the asset URL, modulo 4, to choose a subdomain. Here is a simple helper:

require 'zlib'

# balance images across many domains to force the opening of more connections
# updated to use Zlib.crc32 instead of md5 as per comment from David
def balanced_asset_url(asset)
  idx = (Zlib.crc32(asset || "error") % 4) + 1
  %!http://asset#{idx}.#{request.domain}#{asset}!
end

Then use it like this:

<%= image_tag balanced_asset_url('/images/foo.png') %>

By hashing the asset path, we make sure that this helper always returns the same subdomain for a given asset, no matter how many times it is called.
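You can check this determinism quickly in irb; the path below is just an arbitrary example:

require 'zlib'

# Zlib.crc32 is a pure function of its input, so the chosen index is
# stable across calls, across requests, and across server processes
(Zlib.crc32('/images/foo.png') % 4) + 1  #=> the same integer every time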

This technique is most useful when you have many objects on a page, each of which needs an additional HTTP request to render. By tricking the browser into making more concurrent connections when fetching assets, we can speed up our page load times and make our sites seem more ‘snappy’.

I’m sure you could make this into a nice little plugin or integrate it into the normal image_tag helper if you so desired. This is just an illustration.
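If you did fold it into image_tag, a minimal sketch might look like this (balanced_image_tag is a made-up name, not part of Rails):

# hypothetical convenience helper layering the domain balancing onto
# the regular Rails image_tag helper
def balanced_image_tag(source, options = {})
  image_tag(balanced_asset_url(source), options)
end

Then the view above becomes <%= balanced_image_tag '/images/foo.png' %>.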


Comments

  1. David Felstead said about 1 hour later:
    Hey Ezra - nice idea! I saw a recommendation from a MS or Google guy (can't remember which) saying that, based on his experiments, 5 domains was optimal, from memory. Also, can I suggest using Zlib.crc32 over Digest::MD5.hexdigest - it is almost an order of magnitude more efficient (hope Zed isn't watching my crappy statistics here :)

    [~] davidfelstead$ time ruby -rmd5 -e "250000.times do Digest::MD5.hexdigest('snorkelfish.gif') end"
    real 0m1.629s
    user 0m1.593s
    sys 0m0.016s

    VS

    [~] davidfelstead$ time ruby -rzlib -e "250000.times do Zlib.crc32('snorkelfish.gif') end"
    real 0m0.255s
    user 0m0.240s
    sys 0m0.008s

    Or even just use String#hash, that's twice as fast as crc32 again. I haven't checked the distributions, but for crc32 and MD5 I will wager it's more than good enough; not sure about String#hash. Obviously it's not going to make a huuuuuuuuuge difference in performance either way, but why waste CPU cycles?
  2. David Felstead said about 1 hour later:
    Yow, lost my line breaks... Ooops.
  3. Ezra said about 2 hours later:
    Thanks David, I updated the article to use Zlib instead of md5.
  4. Thorsten von Eicken said about 6 hours later:
    Ezra, have you actually tested this? I doubt this produces an improvement that is worth the hassle. HTTP 1.1 not only supports persistent connections, it also supports pipelining. This means that over those two connections the browser can send the requests back-to-back before even receiving a single response. This completely eliminates the dead time between getting back the last packet of the previous response and sending the next request. Instead, all the requests are queued at the server, which spits out the responses as fast as TCP will allow. With two connections I expect that the network bottleneck is pretty much maxed out. Yes, 8 may eke out 10% or 20% more, but that won't make much of a difference to the user. Anyway, it would be interesting to see some actual packet traces of what the various browsers do. Aaahhh, the ever-returning wish of endless time...
  5. Thorsten von Eicken said about 6 hours later:
    Wow, I thought the "preview comment" link didn't work, but no, it does work, it just takes about 30 seconds...
  6. David Felstead said about 7 hours later:
    Thorsten - check out this analysis of optimizing page load time; this fellow's already done all the hard yards for us. The Cliffs Notes version is that pipelining is actually disabled by default in every browser bar Opera, and spreading content over multiple hostnames is a major improvement (if pipelining is disabled, which it most likely is): http://www.die.net/musings/page_load_time/
  7. Dan Kubb said about 18 hours later:

    Ezra, why not cache the crc32 value in a Hash class variable? (A sketch of that idea appears after the comments.)

    Also, here's another article that provides graphs that visually illustrate how browsers download the HTML and assets serially. Once you see the graphs you'll understand better why this technique works so well.

  8. Joe Ruby said about 20 hours later:
    Yeah, I'd like to see some performance numbers on this...

    I have sites with lots of images on a page and they always seem "snappy" enough.
  9. Ezra said about 21 hours later:
    This technique does indeed decrease page load times. I have tested it thoroughly for a client's site. The more static assets on the page, the more of a difference it makes. On a page with 231 small images it yielded a 20-30% improvement in the load time of the home page. In this case the home page is page cached, so I wasn't very worried about the efficiency of the helper method; the home page only gets rendered out to cache once every 10 minutes.

    @Dan - yes, there are plenty more things you could do with this technique code-wise. Maybe someone will make a plugin. This implementation was just to start a discussion and to see how other people's apps fare when they try this. It is not a hard change to make to an app and the perf improvement is nothing to laugh at.

    @Thorsten - This blog is still hosted at Rimu Hosting on a tiny VPS so it is quite slow. One day when I get some free time I will finally move it to Engine Yard, but it hasn't been a priority yet. Once people have pretty much finished their apps and are looking to eke out any last performance improvements they can get, techniques like this and a few others that I will post articles about can help to get those last few bits of latency out of the setup. Every little bit counts. This technique might not buy you much if you don't have a ton of objects on the page that have to be fetched, but as the number of objects per page goes up, this trick helps more and more.
  10. Joe Ruby said 2 days later:
    How in the world do you end up with TWO-HUNDRED-AND-THIRTY-F'ING-ONE IMAGES ON A PAGE?!? ;P
  11. Kevin said 7 days later:
    Great idea, I was reading up on this about 3 mos ago, on evolt I think?!!? My question is this: how can you set this up to work in a development environment? I mean, it's not practical to put images and assets (that are under construction) on various domains while building. Would something like this work if you aliased ports on, say, localhost?
  12. topfunky said 10 days later:
    Mac OS X doesn't support wildcard subdomains easily, but I did this in /etc/hosts:

    127.0.0.1 asset1.localhost asset2.localhost asset3.localhost

    I also tweaked it a bit to add in the current port so it works in development. (A sketch combining that tweak with Dan's caching idea appears after the comments.)
  13. Kevin said 10 days later:
    topfunky... nice, this should work out nicely. :)
  14. Bob said 14 days later:
    Remember that 4 hosts instead of 1 means 4 DNS lookups instead of 1. How many of us have been subjected to shitty DNS servers that take forever to resolve? Not that this alone should stop you, of course - it's still a great technique. Just don't forget to factor it into the balancing act that is web development.
  15. Garry said 127 days later:
    Thanks for the tip Ezra.
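
Picking up Dan's caching suggestion and topfunky's port tweak from the comments above, here is one hedged sketch of how the two could be combined; the BALANCED_URL_CACHE constant, the cache key, and the port handling are all invented for illustration:

require 'zlib'

# cache the computed URL per (domain, port, asset) so the CRC32 is only
# calculated once per asset, and append the request port so the trick
# also works against script/server in development
BALANCED_URL_CACHE = {}

def balanced_asset_url(asset)
  asset ||= "error"
  key = [request.domain, request.port, asset]
  BALANCED_URL_CACHE[key] ||= begin
    idx  = (Zlib.crc32(asset) % 4) + 1
    port = request.port == 80 ? "" : ":#{request.port}"
    %!http://asset#{idx}.#{request.domain}#{port}#{asset}!
  end
end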
