It's Just A Little Caching Trick [Update]

Friday pro-tip. If you’re a “View HTML Source” cowboy like me, you may have noticed odd, cruft-full CSS or JavaScript links like this:

While it would obviously be slimmer if you just wrote style.css as opposed to style.css?ver=2009-10-04, there’s a neat little trick to that tail cruft. See, modern browsers have rather effective built-in caching protocols which assume that files aren’t changed unless they’re told otherwise. Since the browser reads style.css and style.css?ver=2009-10-04 as two different files, we can effectively let users’ browsers know when they should fetch our updated CSS files: whenever we change that tail cruft. So it makes sense to append the date of your most recent CSS file change to that URL, and you’ll be sure visitors see what you’re seeing.
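As a rough sketch of how that date could be generated rather than typed by hand, here is a small Python helper. The function name `versioned_url` and the idea of reading the date from the file's modification time are assumptions for illustration, not something the original setup necessarily did.

```python
import os
from datetime import datetime, timezone


def versioned_url(url: str, file_path: str) -> str:
    """Append ver=YYYY-MM-DD, taken from the file's last-modified time (UTC)."""
    mtime = os.path.getmtime(file_path)
    stamp = datetime.fromtimestamp(mtime, tz=timezone.utc).strftime("%Y-%m-%d")
    # Use "&" if the URL already carries a query string, "?" otherwise.
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}ver={stamp}"
```

The query string then only changes when the file itself changes, which is the behaviour the trick relies on.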

[Update]: Apparently it’s not such a good trick after all. See the comments.

11 thoughts on “It's Just A Little Caching Trick [Update]”

  1. Lasse Brandt says:

    No no no!!! This is wrong wrong wrong!!! Sorry 🙂
    How an element is cached is controlled by the HTTP headers that the webserver sends with the file. There is a series of headers for cache control (i.e. Cache-Control, Expires, Last-Modified and ETag). Please use these and ONLY these for controlling the cache.
    Simply changing the filename is a very inefficient and very wrong way of trying to control caching behavior in client (browsers) and proxy servers, that could easily result in wrong or “bad” caching.

    1. Joen says:

      You do realize that I didn’t mean the date query string to be updated every day, but only every time the file is actually changed, right?

      How is this a worse caching than sending HTTP headers? (And how can I send PHP headers in a non-php CSS file?)

  2. Lasse Brandt says:

    Joen: You do realize that I didn’t mean the date query string to be updated every day, but only every time the file is actually changed, right? How is this a worse caching than sending HTTP headers? (And how can I send PHP headers in a non-php CSS file?)

    First off, your trick isn’t telling the world _anything_ about caching at all; you are simply renaming the file on every update. That isn’t caching, that’s trying to trick the world.
    The problem is that browsers and proxy administrators don’t have a clue what you are trying to do and are therefore forced to “guess” a solution/action. If you want control of how things are cached out in the world, simply send the correct HTTP headers.
    There are several ways of sending custom HTTP headers: one is a scripting language (like you said, PHP), another is to tell your web server which headers should be sent with different files.
    Another thing: the day your website gets digged, I promise you, the first thing the administrator would do is to place a reverse proxy (caching server) in front of your webserver to take all the hits, and when he figures out that you are sending ?something to try and control the cache, he will (and I know this for sure :)) shoot you in the kneecaps! 🙂

    1. Joen says:

      Well, I gotta admit I trust your experience more than my own in this case. Thanks for the update!

  3. I’m with Lasse on this one Joen. That neat trick isn’t so neat after all. It goes without saying that you should always use the correct headers for all your resources.

    1. Joen says:

      That’s how I learn 🙂

  4. Anders says:

    Actually, that is a neat trick under certain circumstances. But you need to know WHEN, WHY and HOW. I use this “trick” myself extensively.

    Most webservers will distinguish between so-called “static file” file handlers and dynamic handlers. A dynamic handler is what serves files such as .aspx or .php. Static file handlers serve “all the rest” which include .css, .html and all image files.
    Most webservers will set “no-cache” on a dynamic handler which means that your browser will request and re-request it every time you go to it.

    Those same webservers will set a header which is effectively a “re-validate” header on static files such as .css files. Re-validate means that the first time you get that file, your browser also gets, in the header of the response, a date on which the file was last modified. Your browser will store that date together with that file in its cache. When you re-request that file, your browser will include that date in its request header. The handler to which the webserver passes the request will evaluate that request header, including the date, and decide, broadly speaking, on one of two possible responses. Option 1: it will send a status code 304 as part of the response header. In that case the response body will be empty. 304 tells your browser that the file you have is up to date, and without having received the body of the response your browser will use the file it has in its cache. Option 2: it will send status code 200, which means “here comes the body”. Your file is out of date, and instead of using its cache your browser uses what’s in the body and caches that for the next re-request.
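The 304-or-200 decision described above can be sketched in a few lines of Python. This is only an illustration: the function `respond` and its arguments are hypothetical names, and it assumes the server compares its Last-Modified date with the If-Modified-Since date the browser sends back. Real servers do more (ETags, for instance).

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def respond(
    last_modified: datetime,          # must be timezone-aware UTC
    if_modified_since: Optional[str],  # raw request header value, or None
    body: bytes,
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a static file request."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        try:
            client_copy = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            client_copy = None  # unparseable date: fall through to a full 200
        if client_copy is not None:
            if client_copy.tzinfo is None:
                client_copy = client_copy.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= client_copy:
                return 304, headers, b""  # Option 1: cached copy is current
    return 200, headers, body  # Option 2: here comes the body
```

Calling it once without a date and then again with the returned Last-Modified value mimics a first visit followed by a re-request.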

    For instance, in the case of loading my browser used 720 milliseconds (nearly a second) receiving a 304 on the URL

    If I had received a 200 it would have taken longer. In fact, here on my machine it would take slightly more than twice that: 1 second 450 milliseconds. If the file had been bigger it would take much more.

    So, obviously getting the 304 is quicker. Which is exactly why browsers cache stuff.

    There is however one way of making it even faster: set a “valid-to” header (in HTTP terms, the Expires header) on the content, for example saying this file here is valid till the year 2020. In that case your browser will know that what it has in its cache is valid, and won’t even bother re-requesting it. In that case you will probably get it in less than 100 ms. Yeah 🙂

    Note that in order to do this you need to manually set the response header. To do this you typically need either administrative control over the webserver itself, or you can set it programmatically in a dynamic handler.
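For the programmatic route, a minimal Python sketch of the headers involved might look like this. The helper name `far_future_headers` and the one-year default are assumptions for illustration. The "valid-to" idea corresponds to the standard Expires header, and Cache-Control: max-age is its more modern equivalent, so the sketch sets both.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Dict

ONE_YEAR = 365 * 24 * 60 * 60  # seconds


def far_future_headers(now: datetime, lifetime_seconds: int = ONE_YEAR) -> Dict[str, str]:
    """Build response headers that let caches reuse the file without asking again.

    `now` must be a timezone-aware UTC datetime.
    """
    expires = now + timedelta(seconds=lifetime_seconds)
    return {
        "Cache-Control": f"public, max-age={lifetime_seconds}",
        "Expires": format_datetime(expires, usegmt=True),
    }
```

How these headers actually get attached to a response depends on the server or framework in use, which this sketch deliberately leaves out.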

    By setting the valid-to date to sometime in 2020 we shave off 600 milliseconds potentially on a bunch of requests. So far great!

    But what happens if you, perhaps against expectations, actually do have to change your CSS file before the year 2020?

    Well, you can add some new random query parameter to your request. This will change the URL, which will force your browser to re-request the file. Hey presto!
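Why a changed query string forces a fresh request can be shown with a toy model of a browser cache in Python. The class `UrlKeyedCache` is purely hypothetical and assumes every stored entry is still within its far-future expiry; the only point is that the cache is keyed by the full URL, query string included.

```python
from typing import Callable, Dict


class UrlKeyedCache:
    """Toy cache: entries never expire and are looked up by exact URL."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch              # stand-in for a real network request
        self._store: Dict[str, bytes] = {}
        self.network_requests = 0

    def get(self, url: str) -> bytes:
        if url not in self._store:
            # Unknown URL (for example a new ?ver= value): go to the network.
            self.network_requests += 1
            self._store[url] = self._fetch(url)
        return self._store[url]
```

Requesting style.css?ver=1 twice costs one network request; switching to style.css?ver=2 costs a second one, even though the old entry has not expired.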

    The point here is that when you know what you’re doing and have control over your headers, there is in fact a very valid point to this trick.

    Adding the random querystring does NOT affect the response header. It only affects how your browser handles the request, i.e. through cache or re-validate.

    And FYI, the admin of your webserver will NOT shoot you in the kneecap. Not at all 🙂

    If you want to know more about caching and see it in action, so to speak, as I did with references to the URL from your website, Fiddler is everything:

  5. DN says:

    Huh. I dunno. The arguments against weren’t actually very convincing. Got anything more specific or maybe clearer?
    I understand wanting headers to be correct, but this still seems like a fine fail-safe in conjunction with that. (And, really, only server admins can speak comfortably about editing headers and determining response codes. For the normal web designer, those are arcane esoterics.)

    1. Lasse Brandt says:

      I can’t believe how much I disagree with you 🙂

      “And, really, only server admins can speak comfortably about editing headers and determining response codes.”

      – what the?! You gotta be kidding me!
      How on earth should the server administrator know how to set up caching correctly for the application that you build? I’m sorry, as a server administrator I would always love to help out, but in general, I don’t know much about the application that _you_ build.
      It’s your application, you know how the files should (or should not) be cached. If the standard web designer doesn’t know or care about how caching works, then he is doing it wrong.

    2. DN says:

      Haha, okay, Lasse, guess I’ll grant that. What I mean is, there’s a kind of continuum of familiarity between back-end know-how and front-end know-how, and those erring towards the front end let their eyes glaze over when you start talking about headers and server configuration, just as back-end developers tend to stare at the ceiling when you talk CSS and pretty markup and whatnot. By ‘normal web designer’ I meant the front-end folks (front-end leaning, even if they program).
      You may be right that they’re doing it wrong, but I’d put money down that that’s the state of things, even with the educated and conscientious ones. What I’d like to know, in this case, is why the designer shouldn’t embrace this as a part of their resource management. It’s intelligible to them (the ?vers=x carries meaning) and doesn’t cost the server anything extra, perhaps even less if they set a way-forward expiration date like Anders suggests, and serves the exact same file while allowing for an update by changing the version number.

    3. DN says:

      (I’ll gladly take an answer, by the way, it just seems that it hasn’t surfaced yet, besides it somehow angering server admins. But if they [the designers or their back-end folks] are supposed to be in charge of how they serve files at this level, as your reply suggests to me, then I fail to see why this technique should grate on the nerves of the admins.)
